Synchrotron Radiation Circular Dichroism (SRCD) spectroscopy: New beamlines and new applications in biology

New advances in instrumentation, demonstration of proof-of-principle studies, and development of new tools and methods for data analysis and interpretation have enabled the technique of Synchrotron Radiation Circular Dichroism (SRCD) spectroscopy to become a useful tool for structural and functional biology. This paper discusses the characterisation of two new SRCD beamlines, CD1 at the Institute for Storage Ring Facilities (ISA), Denmark and 4B8 at the Beijing Synchrotron Radiation Facility (BSRF), China, and new applications of the method for examining biological systems.


Introduction
Circular dichroism (CD) spectroscopy has been a valuable tool in chemical, biochemical and structural biology studies for more than 40 years. It is regularly used for examining protein secondary structures, dynamics and folding, monitoring conformational changes associated with ligand binding and macromolecular interactions, and is an essential element of the well-founded biophysics laboratory.
A technological development that has significantly enhanced the technique of CD is the use of synchrotron radiation as its light source [1,2], resulting in a technique now known as Synchrotron Radiation Circular Dichroism (SRCD) [3]. The first SRCD beamlines that were developed more than 20 years ago [1,2] showed the viability of the method, although the full potential of the technique for biological studies [4] was not exploited for some years whilst the instrumentation was refined, cross-calibration studies were undertaken [5,6] and enabling methods for analysis and interpretation of the results were developed [7-10]. SRCD has now become a distinct technique in its own right, due to the additional types of studies it enables and the more extensive data that it can produce, relative to conventional CD (cCD) spectroscopy using lab-based commercial instruments [11].
Advantages of SRCD over cCD include the capacity to measure lower wavelength vacuum ultraviolet (VUV) data, the production of spectra with higher signal-to-noise ratios which enable the use of smaller amounts of protein samples and the ability to detect smaller changes accurately, and the improved capability of measuring samples in the presence of additives such as buffers, detergents and lipids, a feature especially important for membrane proteins.
SRCD was still an emerging technique for structural biology when its advantages and potential applications were first reviewed in 2000 [3]. The characteristics of beamlines existing at the time (beamline 3.1 at the Synchrotron Radiation Source (SRS), Daresbury, UK and beamlines U11 and U9b at the National Synchrotron Light Source (NSLS), Brookhaven, USA) and of those in development (CD12 at the SRS - an upgraded replacement for 3.1 - and UV1 at the Institute for Storage Ring Facilities (ISA), Denmark) were reported. Soon thereafter, several additional beamlines were in planning, design or construction stages worldwide [12]. The characteristics of some of the new or planned beamlines have since been described [13-17]. The CD12 beamline at the SRS and the UV1 beamline at ISA have been major sources of SRCD publications in the past 7 years, with HiSOR (Japan) BL15, NSLS U11 and U9b, BESSY2 (Germany), and the early BSRF (China) 3B1B beamline also contributing significantly to the worldwide SRCD output. This is a report on two new beamlines that have recently become available for SRCD users, and our studies on their characteristics (Table 1). There are currently a number of other beamlines in development around the world, including DIAMOND (UK), the successor to the SRS; SOLEIL (France); Hefei (China); Melbourne (Australia); and the NSRRC (Taiwan). This rapid growth in the availability of new SRCD beamlines and facilities reflects both the potential and the realised utility of the technique for structural biology and structural and functional genomics [3,4,11]. This paper also describes some of those new applications of SRCD and their impact in structural biology.

Materials
Horse skeletal myoglobin and hen egg white lysozyme were purchased from Calbiochem and Worthington Biochemical Corp., respectively. Concanavalin A, human serum albumin (HSA), and camphor sulphonic acid (CSA) were obtained from Sigma-Aldrich Ltd.

Sample preparation
The proteins were allowed to dissolve in deionised water overnight at a concentration of ∼8 mg/ml. The solutions were centrifuged to remove any undissolved material and then degassed to remove any dissolved oxygen. The final concentrations of the proteins were determined from the A280 value [18] using duplicate measurements with a Nanodrop D-1000 UV spectrophotometer.
Solutions of CSA were freshly prepared at a concentration of 5.5 mg/ml (determined from the extinction coefficient at 285 nm [19]) and stored at 4 °C in the dark.

SRCD measurements
Protein samples (and their corresponding deionised water baselines) were examined in circular 0.0015 cm pathlength demountable Suprasil cells (Hellma UK, Ltd), which had been previously calibrated using interferometry methods [6]. CSA samples (and their corresponding deionised water baselines) were examined at 25 °C in 0.01 cm pathlength cells over the wavelength range from 320 to 185 nm.
At 4B8, 3 repeats of each of the protein samples were measured at 5 °C over the wavelength range from 280 to 165 nm, using a 1 nm interval and a time constant of 5 s. At CD1, spectra were measured using an interval of 1 nm and a dwell time of 2.1 s. Three repeats of each protein spectrum were measured from 280 nm to 168 nm at a temperature of 20 °C. At CD1, the best signal-to-noise ratios were obtained with an exit slit width of 0.5 mm, and protein integrity was maintained when the spot size at the sample was enlarged from 1 × 2 mm to 2 × 6 mm, which was achieved by increasing the distance of the sample from the focal point of the beam. At both beamlines the effective wavelength cut-off limit, defined by the HT reading as described in [4], was ∼172 nm for all the proteins.
Ten consecutive repeats (with no delay time between spectra) were collected for the HSA samples at each beamline.

Data processing and analysis
Spectral data from both beamlines were processed using identical procedures with CDtool software [7]. It is useful to note here that the data formats from both beamlines were compatible with the current version of CDtool available at http://cdtools.cryst.bbk.ac.uk. Except for the HSA samples, repeated spectra and their corresponding baselines were each averaged, smoothed with the Savitzky-Golay filter, subtracted from each other, zeroed between 260 and 267 nm, and scaled to delta epsilon units. The spectra from 4B8 were blue-shifted by 1 nm to take into account later wavelength calibration measurements. The individual repeated scans of HSA were neither averaged nor baseline-subtracted, but overlaid to show any changes that occurred as a function of beam exposure time.
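The processing sequence described above (average, smooth, subtract, zero, scale) can be illustrated with the following minimal Python sketch. It is not the CDtool implementation. It assumes that the raw signal is in millidegrees, that a 5-point quadratic Savitzky-Golay window is adequate, and that scaling to delta epsilon uses the mean residue weight (mrw) with the standard factor of 3298; the function names and the example values are hypothetical.

```python
from statistics import mean

# 5-point quadratic Savitzky-Golay smoothing weights (they sum to 35).
# The window size is an assumption; the text does not state the one used.
SG5 = (-3, 12, 17, 12, -3)


def average_scans(scans):
    """Point-by-point mean of repeated scans (lists of equal length)."""
    return [mean(points) for points in zip(*scans)]


def savgol5(values):
    """Smooth with the 5-point filter; the two points at each edge are left as-is."""
    out = list(values)
    for i in range(2, len(values) - 2):
        window = values[i - 2:i + 3]
        out[i] = sum(w * v for w, v in zip(SG5, window)) / 35.0
    return out


def process(wavelengths, sample_scans, baseline_scans,
            conc_mg_ml, path_cm, mrw, zero_range=(260.0, 267.0)):
    """Average, smooth, baseline-subtract, zero and scale to delta epsilon.

    Assumes the input signal is in millidegrees (hypothetical; depends on
    the beamline output format) and that mrw is the mean residue weight.
    """
    sample = savgol5(average_scans(sample_scans))
    baseline = savgol5(average_scans(baseline_scans))
    net = [s - b for s, b in zip(sample, baseline)]

    # Zero the spectrum using the mean signal in the chosen wavelength range.
    lo, hi = zero_range
    offset = mean(v for wl, v in zip(wavelengths, net) if lo <= wl <= hi)

    # mdeg -> mean residue ellipticity -> delta epsilon (division by 3298).
    factor = 0.1 * mrw / (conc_mg_ml * path_cm * 3298.0)
    return [(v - offset) * factor for v in net]
```

The zeroing range defaults to the 260-267 nm interval used here, and the pathlength and concentration would be those determined for each sample.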

CSA
The ratios of the CSA peaks at ∼192.5 and 290 nm measured on CD1 and 4B8 were 2.00 and 1.98, respectively. These values fall within the range of ratios (1.96 to 2.15) measured previously on other SRCD beamlines [5], and also correspond closely to the most often-quoted ratio of 2.00 expected for cCD instruments (lab-based instruments are usually calibrated to this value). This means that the data obtainable on either of these beamlines should be comparable in terms of optical rotation to data obtained on other CD instruments (whether SRCD beamlines or commercial lab-based machines) and to the spectra in the SP175 reference dataset [9].
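As an illustration of the ratio check described above, the following short Python sketch computes the ratio of the CSA signal magnitudes near 192.5 and 290 nm from a baseline-subtracted spectrum and tests it against the previously reported range. The function names are hypothetical, and taking the sampled point nearest to each wavelength (rather than fitting the peak) is a simplifying assumption.

```python
def nearest_value(wavelengths, signal, target_nm):
    """Return the signal at the sampled wavelength closest to target_nm."""
    idx = min(range(len(wavelengths)),
              key=lambda i: abs(wavelengths[i] - target_nm))
    return signal[idx]


def csa_ratio(wavelengths, signal, low_nm=192.5, high_nm=290.0):
    """Ratio of the CSA signal magnitudes near 192.5 nm and 290 nm."""
    low = nearest_value(wavelengths, signal, low_nm)
    high = nearest_value(wavelengths, signal, high_nm)
    return abs(low) / abs(high)


def within_reported_range(ratio, lo=1.96, hi=2.15):
    """True if the ratio lies in the range reported for other SRCD beamlines [5]."""
    return lo <= ratio <= hi
```

Both ratios measured in this study (2.00 and 1.98) pass this check.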

Protein spectra
The myoglobin spectra (Fig. 1) measured at both beamlines were essentially identical to each other and to myoglobin spectra collected on other well-calibrated beamlines [5]. Because the two beamlines were found to have similar CSA ratios, there was no need to further cross-calibrate them [5] to match each other or existing spectra in the SP175 reference dataset [9].
The only significant difference in the spectra was that the noise level for CD1 at wavelengths > 230 nm was lower (estimated from the rms peak-to-peak measurements at wavelengths where there was no CD signal) than that for 4B8 at these wavelengths. This is because the flux of 4B8 is lower at high wavelengths (>230 nm) than at low wavelengths (Table 1), whereas CD1 has similar flux over all wavelengths measured in these spectra. This has minimal effect on the overall accuracy of the spectra because there are no significant protein transitions at these wavelengths, but might be an issue in studies of nucleic acids, where important transitions are found at near UV wavelengths.
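One possible way to estimate such a noise level is sketched below in Python: the rms deviation of the signal about its mean within a wavelength window where no CD signal is expected. This is an illustrative reading of the procedure, not the exact calculation used here; the function name and the choice of window are assumptions.

```python
from math import sqrt
from statistics import mean


def rms_noise(wavelengths, signal, window):
    """rms deviation about the mean for points inside window = (lo_nm, hi_nm).

    The window should cover a region with no CD signal; which region
    qualifies is left to the user (hypothetical parameter).
    """
    lo, hi = window
    pts = [v for wl, v in zip(wavelengths, signal) if lo <= wl <= hi]
    centre = mean(pts)
    return sqrt(mean((v - centre) ** 2 for v in pts))
```

Comparing the value returned for spectra of the same sample from two beamlines, over the same window, gives a simple relative measure of their noise levels.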
In these studies the effective cut-off limit at both beamlines was 172 nm, as opposed to the 160-170 nm that has been reported for some other beamlines. This is in part because quartz Suprasil cells were used. If future studies use CaF2 cells [20], it is expected the cut-off limit can be lowered to 170 nm.
Myoglobin, a primarily alpha-helical protein, was chosen for these comparisons as it has been previously used for cross-calibration studies [5]. However, in this study, two other proteins, a mixed alpha-beta protein, lysozyme, and a primarily beta-sheet protein, concanavalin A, were also examined (data not shown). Similar results were also obtained for both of these proteins on both of these beamlines that were consistent with our previous measurements [9] on UV1 and CD12, although for the latter beamline it was necessary to do three separate loadings of the cells rather than three repeat scans, because of protein denaturation (see below).
Fig. 1. SRCD spectrum of horse myoglobin obtained on beamline 4B8 (black line) overlaid on the myoglobin spectrum in the SP175 reference dataset (grey line) [9]. Note that the 4B8 spectrum presented has been blue-shifted by 1 nm (to take into account subsequent wavelength calibration).

Examination of whether beam-induced denaturation was present
A major concern for high-flux SRCD beamlines is the potential heating of the sample that can arise due to the large input of energy at low wavelengths (∼155-175 nm) corresponding to the absorption peak of the water that is used as a solvent [21]. This heating can in turn cause denaturation of the protein present in the sample. This has been shown not to occur on lower energy beamlines such as 3.1, U9b or UV1 [22]. It is, however, a significant problem for some higher flux beamlines such as CD12, and although its effects can be partially mitigated in some cases by attenuating the beam [23], it remains a concern for samples that are subjected to repeated scans (which should be done for good practice, to enable averaging and error level estimation [4]). It is a continuing issue as new beamlines are built on higher energy synchrotrons (i.e. DIAMOND and SOLEIL). Hence it is important to characterise each new beamline with respect to its capability to cause denaturation. Previous studies [21] have shown that the protein HSA is particularly sensitive to such effects, and hence an excellent test sample. In the present study, it was shown that over the course of several hours, during the collection of 10 consecutive spectra of HSA, no degradation in signal was observed (Fig. 2(a) and (b)) for either beamline, although a minor amount of denaturation was seen (data not shown) on CD1 if the sample was placed at the focal point of the beam. The lack of denaturation on these two beamlines is in contrast to previous data taken under the same conditions at CD12 (Fig. 2(c)), and is a very good indicator that these new beamlines will be suitable for a wide range of spectral experiments, including stopped-flow and thermal denaturation folding studies.
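A simple numerical check for such signal degradation across consecutive scans could be sketched in Python as follows. The comparison here was made by overlaying the scans (Fig. 2); the function below is only an illustrative alternative, and its name, the choice of monitored wavelength index and the 5% tolerance are all hypothetical.

```python
def shows_degradation(scans, index, tolerance=0.05):
    """Flag loss of signal magnitude relative to the first scan.

    scans     : list of consecutive scans (lists of equal length)
    index     : position of the monitored wavelength in each scan
                (which peak to monitor is an assumption left to the user)
    tolerance : fractional loss treated as significant (hypothetical value)
    """
    first = abs(scans[0][index])
    return any((first - abs(scan[index])) / first > tolerance
               for scan in scans[1:])
```

Applied to a series of 10 consecutive scans, a result of False would correspond to the behaviour seen on CD1 and 4B8, and True to the progressive loss of signal seen previously on CD12.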

Characteristics of new beamlines compared to existing beamlines
Two new SRCD beamlines have recently come on line, CD1 at ISA and 4B8 at the BSRF. These new beamlines show ideal characteristics for SRCD data collection: cross-calibration studies produced spectra matching the characteristics of spectra produced on existing well-calibrated beamlines and present in existing reference datasets. They also have the distinct advantage over some very high flux beamlines in that they do not cause heat-induced denaturation of proteins during data collection. As more new beamlines join the cadre of SRCD sites worldwide, it would be advantageous for their users if similar characterisation studies on them were made publicly available.

Improved structure determination by SRCD
The use of SRCD in biology, as indicated by the number of published studies, is rapidly expanding. This is not only due to the increasing number of beamlines being built for this technique, but also due to the availability of new tools for analysis, especially the creation of a new reference dataset which includes the low VUV wavelength data present in SRCD spectra.
The SP175 reference dataset containing the SRCD spectra of >70 soluble proteins [9] is now available through the DICHROWEB server [8]. SP175 has a low-wavelength cut-off of 175 nm and was designed to broadly cover a wide range of secondary structures and fold-space (the latter based on the classifications in the CATH protein structure database [24]) for soluble proteins. Cross-validation of the dataset indicates that SP175 improves secondary structure prediction when compared to the only other publicly-available low wavelength reference dataset (containing 23 spectra to 178 nm [25]). Since SP175 contains three times as many spectra, some of the improvement may be ascribed simply to the increased structural and spectral diversity in this bioinformatics-designed reference dataset; however, truncating the wavelength range of this dataset shows that the lower wavelength data also add to the improvements in the analyses. These low wavelength data have many other advantages for examining and interpreting the secondary structure of proteins. The extended wavelength range present in SP175 improves the analyses of beta-sheet rich proteins and proteins which contain considerable amounts of polyproline II or irregular secondary structure [10]. This is because in the higher wavelength part of the far UV region (∼205-220 nm), the large signals produced by helical structures tend to swamp out the smaller signals due to sheets and irregular structures, thus making analyses of those two components less accurate whenever there is a significant amount of helix present. However, because the signals from those components have opposite signs from those of the helical component in the VUV region, analyses for them are much more accurate when SRCD data are available. Also, spectra with the low wavelength data contain additional eigenvectors of information [11,26] and hence enable the analyses of more specific types of secondary structure, i.e. 3₁₀ helices versus alpha-helices, and different types of beta-sheets [10], thus providing more detailed information on the protein secondary structures present.
The newest analytical development using the SP175 dataset has been the use of cluster analysis ([9], Miles and Wallace, in preparation) to gain additional information on protein folds from the SRCD data. Spectra that include data down to 175 nm form cogent groups containing similar CATH classes, architectures and sometimes topologies. For example, beta-barrels and beta-sandwiches are largely segregated and, likewise, mixed alpha-beta sandwiches and alpha-beta barrels form separate clusters. The data that provide this information are from the low wavelength range, since if the dataset is truncated at 190 nm only three major clusters are formed, corresponding to helical, sheet and mixed architectures.
A further, and unexpected, advantage of including VUV data is the improved analysis of spectra for which the precise protein concentration is not known [10,27], something that can otherwise have devastating effects on the analyses of cCD data with existing algorithms.
Finally, for protein folding/unfolding studies SRCD offers an improvement over conventional CD since much of the identifiable structural information from denatured samples is in the far UV, where the presence of denaturants with high absorbances such as urea and guanidine HCl prohibits such measurements in conventional instruments [4,9,28].
In addition to the new wide-ranging SP175 dataset, new narrower, focused datasets have also been created with the aim of improving analyses of specific classes of proteins that are not well-analysed by standard datasets because the proteins have unusual or specific characteristics. One example of such a dataset is CRYST175 [29]. It contains the spectra of nine proteins belonging to the βγ-crystallin family of eye lens proteins. These proteins have a distinctive double Greek-key fold. This narrow reference dataset provides greatly improved results for the limited number of proteins with such fold characteristics. Datasets of this kind may be particularly useful for examining mutants and homologues from other species. Other such focused datasets include membrane proteins [30] and denatured proteins [28].

New applications of SRCD for proteins
In parallel with these advances in data collection, processing, reference datasets and methods of analysis have come demonstrations of new types of experiments and applications of SRCD for examining interesting biological questions.
The following briefly summarises some of the recent (since 2000) structural biology studies enabled by this technique.
SRCD has permitted the identification of very subtle structural differences not detectable by cCD, including the differences between wild type and cataract-causing mutants of human eye lens proteins [31], differences between metmyoglobin in aqueous and helix-promoting organic solvents [32], changes associated with protein-drug binding [33,34], and the identification of complex formation between two proteins even when no secondary structural changes occur [35].
Another area in which SRCD is proving useful is in thermal studies of protein folding/unfolding. Because the low wavelength data enable measurements of additional transitions, and the high flux permits measurements in high ionic strength solutions, it has been possible, for example, to detect a heretofore unseen folding intermediate in tropomyosin, which differs from a mutant version of the protein [36], and to investigate the stabilisation of folding intermediates by chaperones [37].
Other uses have been as a test of homology modelling of expressed structural domains [38] and a means of examining proteins in organic solvents [39].
SRCD has also been shown to be helpful for examining physically challenging systems for which scattering artefacts can predominate in cCD, including fibrous proteins where it has been used to follow the conversion of spider silk liquid to solid fibres [40], and membrane proteins, which are present as large detergent/lipid/protein complexes [30].
A particularly fruitful new use is the combination of complementary techniques. For example, two synchrotron techniques, SRCD and SAXS, have been combined to examine both secondary structures and tertiary structures of individual proteins and complexes [41-43]. In another recent study, kinetic SRCD studies were combined with single molecule fluorescence to follow the dynamics of an ensemble of collapsed unfolded proteins [44].
Whilst not comprehensive, the above list demonstrates the wide range of structural biology investigations that have employed SRCD, many of which would not have been possible with conventional circular dichroism spectroscopy using lab-based instruments.

New applications of SRCD for other macromolecules
Although the majority of the SRCD studies to date have been on proteins, it is clear that CD studies of other types of macromolecules will also benefit from the low wavelength data available in SRCD spectra.
CD studies have been important for identifying backbone conformations of DNA molecules, especially from the near UV region of the spectra. Early SRCD studies showed the presence of additional bands below 180 nm [45]. More recent SRCD studies of nucleosides and nucleotides have shown that the bases apparently contribute more to the VUV spectra than the sugars. The effects of pH and temperature suggested the VUV bands are strongly sensitive to structural modification and chemical environment [46].
Other studies have shown that sugars have low CD signals in the far UV (an advantage for cCD studies that tend to ignore sugar components of glycoproteins), but generally do have transitions that give rise to significant signals in the VUV region. Their peak positions, signs and magnitudes of signals are indicative of the types and configurations of the sugars present both on their own and as part of nucleotides [47-49], and could be used in the future to identify components of complex sugar samples. SRCD data have also been used to interpret VUV spectra of glycoproteins to determine the contributions from the sugar components and their effects on these protein structures [34,50], and to examine the formation of protein-sugar complexes [51].

Potential for developments in the future: The Protein Circular Dichroism Data Bank (PCDDB)
A further new development that may aid interpretation and analyses is the PCDDB [52]. This is a public archive being created for SRCD data measured at all beamlines, as well as CD spectra from conventional instruments. It should enable a range of new bioinformatics and structural biology studies. It is an undertaking involving many of the SRCD beamlines as mirror deposition and access sites [12], and is a further demonstration of cross-cooperation within the growing SRCD community. In addition to the obvious advantage of accessibility to data for published validated spectra, it will have the added advantage of providing a ready source of spectra that can be used to create both broader-based (like SP175) and more narrowly-focused (like CRYST175) reference datasets to further improve analyses.
In summary, the availability of new beamlines, analysis tools, and proof-of-principle studies has resulted in SRCD spectroscopy becoming an important new tool in structural biology.

Fig. 2. Plots of 10 consecutive SRCD scans of human serum albumin obtained on (a) CD1, (b) 4B8 and (c) for comparison, scans previously obtained [21] on CD12. For each plot the first and last scan are black solid and dashed lines, respectively, and the intermediate scans are in grey.

Table 1
* Determined in this study.