Although substantial progress has been made in sample pretreatment over recent years, considerable limitations remain when it comes to overcoming the complexity and dynamic range problems associated with peptide analyses from biological matrices. Being the little brother of proteomics, peptidomics is a relatively new field of research aiming at the direct analysis of small proteins, called peptides, many of which are not amenable to typical trypsin-based analytics. In this paper, we present an overview of different techniques and methods currently used for reducing a sample's complexity and for concentrating low-abundance compounds to enable successful peptidome analysis. We focus on techniques that can be employed prior to liquid chromatography coupled to mass spectrometry for peptide detection and identification, and we indicate their advantages as well as their shortcomings when it comes to the untargeted analysis of native peptides from complex biological matrices.
Peptides are small (low molecular weight, LMW) proteins, built up of amino acids connected by peptide bonds. The shortest peptide is two amino acids long, and with increasing length of the amino acid chain, the name changes from peptide over polypeptide to protein, with a fuzzy border between them. Also the International Union of Pure and Applied Chemistry (IUPAC) has no clear weight or amino acid chain length limit. A somewhat arbitrary but quite generally acknowledged definition puts the boundary between a peptide and a protein at a chain length of 50 amino acids (WIKIPEDIA). In literature, often pragmatic definitions are used, such as “the small proteins typically running off a typical 2D polyacrylamide gel” or “the small proteins with zero or maximally one tryptic cleavage site”, and, therefore, different upper molecular weight limits for peptides can be found from 10 kDa and even beyond [
Body fluids, especially blood serum or plasma, and, in particular cases, (primary) cell culture media, serve as typical and readily available sources for a “peptidomics-driven” discovery of novel candidate disease biomarkers. However, the detection of peptide biomarkers typically present at low concentrations is hampered by the “masking” effect caused by a number of highly abundant proteins [
Illustrative iceberg representation of high dynamic range of proteins found in blood, showing various classes of proteins and peptides (figure composed of literature data [
Liquid chromatography (LC) coupled with tandem mass spectrometry (MS/MS) is the analytical method of choice in today’s proteomics and peptidomics research. Its major benefits include enhanced specificity (particularly over the GC-MS technologies of 25 years ago, which had very limited applicability for peptide separations), its potential for high-throughput analyses, no requirement for expensive analyte-specific reagents, high speed of assay development, and a relatively low cost per assay (the instrument itself, however, not being that cheap) [
In an ideal world, no sample preparation would be required for the analysis of a sample, as every manipulation can lead to problems such as loss of sample and, even worse, loss of quantifiability. In particular, for the proteomic/peptidomic analysis of native peptides, which typically do not require protease (trypsin) digestion prior to LC-MS/MS analysis, a so-called “top-down” approach seems logical. However, as the complexity of samples still far exceeds the capacity of currently available analytical systems, specific sample preparation remains a crucial part of the analysis as a whole. Peptidomics sample preparation is often time consuming and laborious, involving multiple steps [
Summary of the strengths and weaknesses of analytical tools used in peptidome research as discussed in this review. Abbreviations: IEX: ion exchange; LC: liquid chromatography; MWCO: molecular weight cut-off; OS: organic solvent extraction; PAGE: polyacrylamide gel electrophoresis; RAM: restricted access material; RP: reversed phase; SPE: solid phase extraction; UF: ultrafiltration; SEC: size exclusion chromatography.
Technique | Strengths | Weaknesses |
---|---|---|
Depletion | (i) Removes highly abundant “household” proteins, allowing a “deeper” look into the peptidome | (i) Requires costly antibody columns; (ii) Each protein to be removed requires a different specific antibody; (iii) Loss of peptides by nonspecific binding |
PAGE | (i) Traditional well-established method | (i) Unsuitable for highly complex samples, poor dynamic range |
OFFGel | (i) Effective prefractionation tool | (i) Postconcentration is required; (ii) Low resolution |
UF | (i) Fast | (i) Variable quality and reproducibility of commercial devices |
LC | (i) High resolution | (i) Extensive method development for each specific matrix is required |
SEC | (i) High resolving power | (i) Loading limited by small injection volume |
RAM | (i) Effective removal of HMW compounds | (i) Complicated LC setup |
OS | (i) Easy to operate | (i) Tedious to perform |
Schematic overview of methods and techniques used in proteome and peptidome analyses for sample preparation prior to LC MS/MS.
Conditioned media are cell culture media in which cells have been grown for a certain period of time. The cells “condition” the media by releasing/secreting proteins, cytokines, and other biomolecules. As such, culture supernatants or conditioned media (CM) can be considered yet another (“body”) fluid that can serve as a source for the identification of novel biomarkers, for example, in cancer research [
A traditional well-established technique in proteomics is one- or two-dimensional (1D or 2D) polyacrylamide gel electrophoresis (PAGE). Although more a protein than a typical peptide separation technique, 1D and 2D PAGE are used in “peptidomics” workflows as well, particularly when targeting larger (poly)peptides or proteins. The latter can then be analyzed by MS after extraction from the gel as proteolytic fragments, after in-gel digestion. Denaturing agents, such as SDS, are used to unfold the macromolecules and disrupt noncovalent intra- and intermolecular protein/protein interactions. SDS PAGE is a rather simple technique and, above all, very robust. Its poor resolving power, however, often poses a problem in the analysis of complex mixtures. 1D and particularly 2D PAGE are employed to increase the depth of proteome/peptidome analysis, that is, through fractionation of the sample components and removal of LMW impurities, particularly salts, which interfere with subsequent MS analyses [
2D PAGE is highly sensitive, predominantly to the molecular charge of a protein (through the isoelectric focusing step), making it a very effective method to reveal/separate certain posttranslational modifications such as phosphorylations, sulfations, or glycosylations. Limitations are that proteins/peptides with extreme pI values cannot be separated and that the smaller peptides are typically not retained in the second (MW separation) dimension. 2D PAGE followed by in-gel digestion is relatively time consuming and laborious. The dynamic range for detection in 2D PAGE is 10²–10⁴, which is less than the protein expression range observed in biological systems. In order to detect low-abundance proteins, more protein needs to be loaded. Such higher loads, however, often further compromise the technique’s resolution due to spot fusion and comigration [
Important in PAGE is the visualization of the separated proteins, although selected areas of a gel can be processed for MS from unstained preparative gels after comparison/alignment with an analytical reference or master gel.
The most commonly used visible stains are Coomassie brilliant blue (CBB) and silver nitrate staining. CBB staining is easy, MS compatible, and linear over, at least, one order of magnitude, so it is usable for quantification to a limited extent. Silver nitrate is a more sensitive staining method—0.5 ng versus 50 ng for CBB—but the staining procedure is more labor intensive and has a more limited linear range (due to its polychrome results). Although widely regarded as the standard of rigor by which all other “ultrasensitive” staining methods are judged, silver staining remains quite a complex and variable protein-gel-staining methodology, with many dozens of published protocols, all of them requiring several steps. Silver staining quantitation is never simple, due to the complex polychromatic nature of the color development and to considerable differences in response factors between different proteins. Fluorescent dyes are also popular to visualize proteins in gels; however, they are not cheap reagents and require expensive scanners/image analyzers. They exhibit detection sensitivity rivaling that of silver staining, with workflow advantages similar to CBB staining. Fluorescence offers linear quantitation ranges 10–100-fold greater than other colorimetric methods [
Because of the limitations associated with gel-based techniques, particularly with respect to the detection of the smaller proteins and peptides (see above), attention has recently turned to off-gel methods for peptide/protein separations, in particular in-solution, pI-based peptide separations without the need for carrier ampholytes. Off-gel electrophoresis focuses proteins and peptides on an immobilized pH gradient (IPG) gel, which is sealed against a multichamber frame that contains both sample and focusing solutions. The sample is separated by migration through the gel, followed by diffusion into the well adjacent to the corresponding section of the IPG strip. The technique allows multiple samples to be run simultaneously and requires only a small sample volume and no prior sample cleanup. Disadvantages are its rather long separation time and the need for an insulated cooling system [
In general, off-gel separation has clear advantages over a gel-based approach with respect to focusing and concentrating peptides, but it still requires further optimization to reach the same level of identifications as an RPLC-based separation.
Many different approaches exist to separate proteins based on their biochemical and biophysical properties such as molecular weight, mass, and hydrophobicity. However, these separation methods are not protein selective. Another way to reduce a sample’s complexity is to specifically remove the most abundant protein(s), by doing (immuno-) affinity capturing [
Depletion of highly abundant proteins can be done based on dyes or on antibodies. An example is the removal of albumin from serum, plasma, or cell culture samples. The most used dye for removal of albumin is Cibacron blue (often in combination with protein G for the removal of IgG). This dye however does not only show affinity for albumin but also for NAD, FAD, and ATP binding sites of proteins, which often results in the unwanted removal of proteins of interest [
Immunodepletion based on monoclonal antibodies (mAbs) is generally not preferred as, besides being very expensive, these antibodies typically remove only proteins or protein fragments with the specific targeted epitope, whereas other fragments of the protein remain untouched. Therefore, immunodepletion systems are generally based on polyclonal IgG and/or IgY antibodies, targeting multiple epitopes on the same proteins. Moreover, mixtures of polyclonal antibodies to distinct proteins are nowadays commonly used for removing multiple highly abundant proteins at once [
When using antibodies, one has to consider the number of proteins that have to be depleted from the sample. Depending on the system used, it is possible to remove between 1 and 20 abundant proteins. Roche et al. compared several systems that deplete different numbers of proteins [
Recently, a creative way of depleting a sample was developed, the so-called hexapeptide library of combinatorial peptide ligands. Highly abundant proteins are expected to quickly saturate their specific affinity ligands, leaving nonbound highly abundant proteins to be washed away. In contrast, low and medium abundant proteins and peptides do not saturate their ligands and hence are concentrated on the beads. This technique has the advantage that peptides and proteins are adsorbed under native conditions, thus allowing their biological activity to be monitored [
Limited comparative studies are published on the different depletion and enrichment methods. In these few comparisons, most of these methods are found to be complementary to each other. Typically, the methods compared all lead to identification of a number of peptides and proteins, a part of which is generally identified by all methods under investigation and another part which has been identified uniquely in a sample that was treated with one of the methods [
In those cases where larger members of the peptidome need to be addressed, another crucial part of sample pretreatment is alkylation and digestion of the peptides/small proteins. This can be performed prior to or after the previously described techniques, that is, just before LC-MS/MS analysis.
To break the tertiary structure of peptides, disulfide bridges have to be disrupted (reduced) and blocked to prevent reoxidation. Breaking of disulfide bonds is traditionally achieved using reducing agents such as dithiothreitol (DTT) or tris(2-carboxyethyl)phosphine hydrochloride (TCEP) [
When larger (poly)peptide members of the peptidome are envisaged, a bottom-up approach, requiring proteolytic digestion prior to mass spectrometric analysis, is sometimes to be considered. Trypsin is by far the most used proteolytic enzyme for the degradation of proteins or peptides. This protease has the advantages of having a high cleavage specificity and being stable in a wide range of conditions. It cleaves C-terminal to arginine or lysine residues (except where the subsequent amino acid in the parent sequence is a proline). Thanks to the biological distribution of these amino acids among all proteins, the resulting peptide masses typically fall within the range required for analysis by mass spectrometry. Some larger peptides or even proteins have been described in the literature that contain only 1 tryptic cleavage site, producing a peptide still too large to be readily detected by mass spectrometry. In those cases, a combination of 2 or more proteases and/or alkylation with bromoethylamine (BrEA) instead of iodoacetamide (IAM) may be used to assist in peptide identifications. Trypsin is very similar to chymotrypsin in primary structure; however, chymotrypsin prefers cleaving C-terminal to amino acids with bulky aromatic side chains such as phenylalanine, tyrosine, and tryptophan [
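The trypsin cleavage rule described above (cleavage C-terminal to lysine or arginine, except when the next residue is proline) can be illustrated with a minimal in-silico digestion sketch. The function names and the example sequence are hypothetical and chosen only for illustration; real digestion is less clean (missed cleavages, sequence-context effects), which this sketch deliberately ignores.

```python
import re


def tryptic_sites(sequence: str) -> list[int]:
    """Return positions (number of residues from the N-terminus) after which
    trypsin is expected to cleave, using the simple rule: after K or R,
    unless the next residue is P. The C-terminal residue is not a site."""
    return [
        m.end()
        for m in re.finditer(r"[KR](?!P)", sequence)
        if m.end() < len(sequence)
    ]


def digest(sequence: str) -> list[str]:
    """Split a sequence into its theoretical fully tryptic fragments."""
    fragments, start = [], 0
    for site in tryptic_sites(sequence):
        fragments.append(sequence[start:site])
        start = site
    fragments.append(sequence[start:])
    return fragments


if __name__ == "__main__":
    example = "AAKPGGRCCKDD"  # hypothetical sequence, not taken from the text
    print(tryptic_sites(example))
    print(digest(example))
```

Counting the sites returned by `tryptic_sites` also gives a quick way to flag sequences with zero or only one tryptic cleavage site, for which a second protease may be worth considering.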
Ultrafiltration is a rather easy and widely used technique to fractionate a proteomics sample into a LMW fraction (the “peptidome”) and a HMW fraction (the rest of the “proteome”) by centrifugation [
To disrupt potential protein-protein/peptide interactions, acetonitrile (ACN) is added to the sample before ultrafiltration. ACN this way improves the recovery of LMW peptides [
Ultrafiltration units can also be used as “reactors” for digestion of proteins and chemical modifications. This approach is known as filter-aided sample preparation (FASP) and it can be used to combine the advantages of in-gel and in-solution digestion [
The addition of organic solvents to serum causes high molecular weight (HMW) proteins to precipitate, leaving the LMW protein fraction, including the peptides, in solution. By also adding ion-pairing reagents such as trifluoroacetic acid, peptides and smaller proteins can be dissociated from highly abundant proteins, thereby facilitating their extraction [
As an alternative to organic solvents, ammonium sulfate (AS) can be used for precipitation of proteins. Although AS is a very efficient precipitant, it can cause interface contamination when combined with LC-MS [
Kawashima et al. [
Comparison of the yield of low molecular weight protein/peptide extraction from serum by means of differential solubilization (DS), organic precipitation (OP), and ultrafiltration (UF). All techniques effectively remove the high molecular weight serum protein, whereas recovery of LMW proteins/peptides is highest with DS. Reprinted with permission from Kawashima et al. [
Another analytical tool to separate LMW compounds from HMW compounds is size exclusion chromatography (SEC). This is a widely used technique for the purification and analysis of synthetic and biological polymers based on their size, which is not by definition the same as their molecular weight. It separates polysaccharides, nucleic acids but also proteins and peptides. The material used for SEC consists of porous beads, which either exclude the peptide/protein analytes from the internal space or allow them to enter based on their size. Peptides and smaller proteins, which can enter the beads, will move at a slower rate through the column than bigger proteins which cannot penetrate the beads, thus migrating faster. A disadvantage of this technique is its low resolving power, which can be improved by using it in combination with other separations, such as in multidimensional separation approaches (see below). Other drawbacks are the high elution volumes which cause dilution of the sample, increased costs when having to use multiple columns, and the need for high sample loads [
Reversed phase liquid chromatography (RPLC) separates molecules based on differences in their hydrophobicity. The mobile phase is a water and nonpolar organic solvent mixture, whereas the stationary phase is hydrophobic. Factors influencing the selectivity and resolution of separation include the hydrophobic ligand, the particle size, sample volume, column length, and pH. The most commonly used hydrophobic ligand for peptide analysis is C18, but C4 and C8 are preferred for the larger peptides and proteins. In terms of peptide extraction, the solid phase material can be packed in syringe-shaped cartridges, or even in 96-well plate formats, which allows for high-throughput extractions. Another advantage of this technique is its capability to desalt samples, which is desirable prior to mass spectrometric analysis [
Ion exchange chromatography (IEX) separates peptides (and proteins) based on their charge in a specific salt environment and pH of the mobile phase. Two types of IEX exist, namely, cation and anion exchange (CX and AX, respectively). A disadvantage of IEX is the use of salts, which makes the eluate incompatible with MS. The use of salts can also lead to irreversible peptide or protein adsorption to the resin, resulting in sample loss. On the other hand, the principles of IEX are well understood, and further advantages include the high resolution that can be achieved, high capacity, selectivity, and robust operation.
Usually, simple salt buffers are sufficient and concentrations are used in a defined range, in which the so-called salting-in effect on proteins is observed. This is the range where a protein becomes more soluble with increasing salt concentration. Cation exchangers are negatively charged, and anion exchangers are positively charged. Above its pI, a protein is negatively charged and binds to an anion exchanger; below its pI, it is positively charged and binds to a cation exchanger. The ion exchanger itself behaves like an acid or base, and the disproportionation of the charges depends on the pH. Strong ion exchangers behave like a strong acid or base and do not change their charge within a wide range of pH, whereas weak ion exchangers do. This property can also be exploited to gain selectivity or by applying pH gradients for elution [
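The relationship between buffer pH, pI, and the type of exchanger an analyte binds to can be written down as a small decision rule. The sketch below is illustrative only: the function name and the pI and pH values in the usage example are assumptions, and it ignores salt concentration, local charge patches, and the difference between strong and weak exchangers.

```python
def exchanger_for(pI: float, pH: float) -> str:
    """Pick the exchanger type an analyte is expected to bind at a given pH.

    Above its pI the analyte carries a net negative charge (binds an anion
    exchanger); below its pI it carries a net positive charge (binds a
    cation exchanger). At pH == pI the net charge is zero.
    """
    if pH > pI:
        return "anion exchanger"
    if pH < pI:
        return "cation exchanger"
    return "no net charge"


if __name__ == "__main__":
    # Hypothetical peptides with assumed pI values, loaded at pH 3.0 and pH 8.0.
    for name, pI in [("acidic peptide", 4.5), ("basic peptide", 9.5)]:
        for pH in (3.0, 8.0):
            print(f"{name} (pI {pI}) at pH {pH}: {exchanger_for(pI, pH)}")
```

Under these assumptions, loading at low pH drives most peptides toward a cation exchanger, since most analytes are then below their pI.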
Restricted access material (RAM) has been described for use in the separation of large biomolecules and the extraction of LMW analytes. RAM could be considered to be an “upgrade” of normal size exclusion material. The outer surface of the RAM particles is coated with a protective, nonadsorptive hydrophilic packing, while the surface of the pores can be coated with a variety of different affinity matrices [
The most used type of internal selection material is strong cation exchange (SCX), as this type of material is highly suitable for the online extraction of target peptides from complex biological samples such as plasma [
Because of the high complexity of biological samples, a single fractionation or separation step is often insufficient. Therefore, several techniques can be combined into what is referred to as a multidimensional separation. Combining 2 (different) separation techniques leads to an increased number of measurable peptides, an enlarged overall dynamic range, and thus an improved peptidome coverage. A multidimensional approach can be achieved both offline and online. For the offline separation, fractions are collected after the first dimension and are later reinjected into the second dimension separation. In between both steps, the fractions can be manipulated if necessary. A disadvantage of this method is the potential sample loss when transferring between both dimensions. In online multidimensional separation, the samples are automatically transferred from the first to the second dimension. A drawback of this approach is that both dimensions have to be compatible. Also, the solvents should not cause salt precipitation or immediate elution of the compounds in the second dimension. Most often, RPLC is the second dimension because of its high speed, desalting capability, and compatibility with mass spectrometry [
It is clear that, besides selecting the best methodology to comprehensively separate peptides from a biological mixture, an essential part of peptidomics sample preparation is the preservation of the integrity of the
Effect on the number of detected (neuro)peptides (from mouse hypothalamus) by postmortem time (time between tissue collection and heat denaturation/stabilization) [
In general, it can be concluded that sample preparation for the purpose of capturing the peptidome from biofluids still has room for improvement, and a single generic methodology is not (yet) available. Many methods exist for the analysis of proteins which can, sometimes with minor adjustments, be used for peptides as well. Ideally, sample preparation/handling should be minimal, but current mass spectrometers are not quite able to handle complex biological matrices. Manufacturers of mass spectrometry instrumentation are continually improving the performance of their systems. For example, over the last decades, the sensitivity of mass spectrometers has increased tremendously and has now reached the low attomole level. Also, improvements in mass accuracy and advances in peptide fragmentation techniques now permit highly confident identification and even full de novo sequencing of novel peptides. Despite these ongoing improvements, it should be noted that, at the moment, an MS system for peptidomics analysis that can operate without sample preparation is still far away, which is mainly due to the high complexity of the biological matrices. The biggest bottleneck in biofluid research is the high dynamic range over which the proteins/peptides occur. For example, the albumin concentration is ~40 mg/mL in serum, whereas the concentration of biologically active compounds such as cytokines is ~1 pg/mL. The removal of abundant plasma/serum proteins by dye- or immuno-based depletion is at the moment the most popular strategy to “mine” deeper into the proteome/peptidome. However, one should also realize that albumin and other major proteins in blood plasma carry a number of other “adsorbed” peptides and even small proteins on their surface, and these therefore risk being lost during this depletion step.
The precipitation of the abundant protein fraction in human biofluids by organic solvents has probably the longest history among all the methods used for abundant protein removal. This procedure is still one of the most effective methods, as it provides a fast and cheap, although not always highly reproducible, way of obtaining the peptide fraction for analysis by mass spectrometry.
Several other analytical strategies are at hand that aid in the reduction of the complexity of samples, and some specifically focus on capturing the LMW fraction, that is, separating the smaller proteins and peptides from the bulk of larger proteins, for example, ultrafiltration as a method to split a sample into a proteome and a peptidome fraction. However, one should be aware that several studies have shown dramatic differences in the performance (in terms of reproducibility, recovery, and separation) of commercially available ultrafiltration devices.
A more preferred analytical tool to separate biomolecules on the basis of size is liquid chromatography. Size exclusion chromatography and, more specifically, restricted access material have high potential in peptidome research, due to their high resolution and selectivity. Reversed phase and ion exchange chromatography are highly suitable for peptide fractionation and separation. These last two chromatography techniques can also be miniaturized, making them even more favorable in terms of sensitivity.
In general, as sample preparation is often time consuming and laborious, high-throughput and automated approaches are highly desirable. Whereas, up to now, the first dimension in multidimensional LC typically led to the collection of fractions in an offline setup, online multidimensional LC setups are gaining popularity. The column material used in liquid chromatography is also continually being improved, and the possibility of combining separations based on different physicochemical properties, such as in RAM, is a big step forward. Even though it usually takes some effort to completely set up and optimize an LC-based method, once done, it usually results in a very robust and reliable technique, which can be fully automated. Miniaturization is also an important aspect of this evolution, as a miniature system (e.g., lab-on-a-chip) will allow for lower amounts of solvents and less complicated equipment to be used, decreasing the cost of an analysis. With the development of new sample pretreatment methods and lower detection limits on mass spectrometers, future peptidome analysis will aid substantially in biological and medical research, for example, by discovering new biomarkers and uncovering novel signaling pathways.
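To make the dynamic range problem concrete, the concentrations quoted above (albumin at ~40 mg/mL, cytokines at ~1 pg/mL) can be converted into orders of magnitude and set against the upper detection range quoted earlier for 2D PAGE (up to 10⁴). This is only a back-of-the-envelope sketch; the function name is hypothetical and the inputs are the approximate figures from the text.

```python
import math


def orders_of_magnitude(high: float, low: float) -> float:
    """Return log10 of the ratio between two concentrations in the same unit."""
    return math.log10(high / low)


if __name__ == "__main__":
    albumin_g_per_ml = 40e-3   # ~40 mg/mL, as quoted in the text
    cytokine_g_per_ml = 1e-12  # ~1 pg/mL, as quoted in the text
    page_range = math.log10(1e4)  # upper end of the 2D PAGE range in the text

    span = orders_of_magnitude(albumin_g_per_ml, cytokine_g_per_ml)
    print(f"serum span: about {span:.1f} orders of magnitude")
    print(f"shortfall versus 2D PAGE: about {span - page_range:.1f} orders")
```

The concentration ratio is about 4 × 10¹⁰, that is, between 10 and 11 orders of magnitude, which is why some form of depletion or enrichment is needed before low-abundance peptides come within reach.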
The authors wish to thank the European Marie Curie Training Network (Grant number MCRTN-CT-2006-035854) and The Netherlands Proteomics Centre for their financial support.