A Proteomic Approach for Plasma Biomarker Discovery with iTRAQ Labelling and OFFGEL Fractionation

Human blood plasma contains a plethora of proteins, encompassing not only proteins with plasma-based functions but possibly also every other human protein at low concentration. As it circulates through the tissues, the plasma picks up proteins that are released from their origin by physiological events such as tissue remodeling and cell death. Specific disease processes or tumors are often characterized by plasma “signatures,” which may become apparent through changes in the plasma proteome profile, for example, through overexpression of proteins. However, the wide dynamic range of proteins present in plasma makes their analysis very challenging, because high-abundance proteins tend to mask those of lower abundance. In the present study, we used a strategy combining iTRAQ, a reagent that improves peptide ionization, with peptide OFFGEL fractionation, which our previous research has shown to improve the proteome coverage of cellular extracts. Two prefractionation methods were compared: immunodepletion and a bead-based combinatorial hexapeptide library technology. Our data suggested that the two methods were complementary with regard to the number of identified proteins. iTRAQ labelling, in association with OFFGEL fractionation, allowed more than 300 different proteins to be characterized from 400 μg of plasma proteins.


Introduction
Blood circulates throughout every part of the body, and no other biofluid has the same degree of intimacy with the body. Therefore, it is not surprising that it possesses such a wealth of information concerning the overall pathophysiology of a patient. As an example, alterations in protein abundance can serve to indicate pathological abnormalities: diseases, toxic effects of clinical treatments and so forth.
The choice between plasma and serum has been abundantly documented in the literature [1,2]. When blood is collected, many changes occur in the proteins it contains, because proteolytic enzymes (proteases) and other enzymes remain active in the blood sample during handling and processing. The HUPO Committee and its research collaborators recommended plasma as the preferred blood specimen. The reasons for this are (i) less ex vivo degradation and (ii) much less variability than with the protease-rich process of clotting. Misek et al. [3] showed with Cy5-, Cy3-, and Cy2-labeled serum and plasma on DIGE-2D-PAGE, after extensive fractionation of intact proteins before tryptic digestion, that isoforms of abundant proteins were more often shifted to lower-than-expected MW in serum than in plasma. Tammen et al. [4] reported that 40% of the low-MW peptides detected were serum specific.
Biomarker discovery in plasma is often limited by the availability of sufficient sample volumes. It is also complicated by the wide dynamic range of the human plasma proteome, which comprises proteins spanning more than 11 orders of magnitude in concentration, with the 10 most abundant plasma proteins accounting for approximately 90% of the total plasma protein content. Potential disease biomarkers are often present at low concentrations, and the dynamic range of the plasma proteins poses a significant analytical challenge to proteomic approaches. A prefractionation step is therefore necessary in the biomarker discovery process. The most common technique is immunodepletion, which uses specific antibodies and has been extensively applied to the removal of high-abundance proteins [5,6]. More recently, saturation protein binding to a random peptide library has been proposed as an alternative method [7,8].
One of the methods used to discover biomarkers is the identification and quantification of proteins by an iTRAQ-based quantitative proteomic approach. iTRAQ is ideally suited for biomarker applications, as it provides both quantification and multiplexing in a single reagent. It has been applied to the analysis of clinical samples, such as human cerebrospinal fluid and disease tissues, and to the in vitro profiling of cells to identify differentially expressed proteins. To the best of our knowledge, only two published papers currently describe the use of iTRAQ to study human serum and plasma. Hergenroeder et al. [9] applied iTRAQ and electrospray ionization tandem mass spectrometry (ESI-MS/MS) to serum depleted of 12 high-abundance proteins, leading to the identification of 160 proteins. Song et al. [10] used an iTRAQ protocol and MALDI-MS/MS to identify 105 proteins in human plasma.
We recently demonstrated that iTRAQ labelling combined with peptide OFFGEL fractionation in the first dimension improved the identification of weakly concentrated proteins from a cellular extract [11]. The aim of the present study was, firstly, to use iTRAQ as a reagent to improve the MALDI ionization of peptides and, secondly, to evaluate the performance of our previous strategy for the study of the human plasma proteome in terms of the number of identified proteins, the presence of high-abundance proteins, and the identification of medium- and low-concentration proteins.

Human Blood Plasma Samples.
A citrated plasma pool, composed of 10 methylene blue virus-inactivated plasma samples obtained from healthy donors by apheresis, was provided by the French National Public Blood Institution (Etablissement Français du Sang Bourgogne Franche Comté, CHU Le Bocage, Dijon, France). The plasma pool (Internal Quality Control) was loaded into 0.5 mL bar-coded straws, stored at 4 °C for 2 hours, and then transferred into liquid nitrogen. The plasma straws were packed in a dry ice container during transportation to our laboratory and were immediately stored in a −80 °C freezer until they were needed.

Immunoaffinity Depletion of High-Abundance Proteins.
In four independent experiments, the 14 most abundant proteins were removed from the plasma by antibody-based depletion with a Human 14 Multiple Affinity Removal System, MARS-Hu 14 (Agilent Technologies, Santa Clara, CA, USA). Two different spin cartridges were used to deplete 10 times 10 μL of 0.22 μm-filtered plasma. This process required 2 buffers, A and B (Agilent Technologies). The pH 7.4 phosphate salt-containing buffer A was used for the equilibration, loading, and washing steps. Flow-through fractions containing low-abundance proteins were collected and stored at −80 °C until analysis. The pH 2.5 urea buffer B was used to elute the bound, highly abundant proteins from the cartridge. The experiment was conducted at room temperature according to the protocol supplied by the manufacturer.

Hexapeptide Ligand Library Treatment.
Plasma proteins were "equalized" using the ProteoMiner Protein Enrichment Kit (Bio-Rad Laboratories, Hercules, CA, USA). Four different spin columns were run in parallel, according to the manufacturer's instructions. Each was loaded, for 2 hours at room temperature, with 900 μL of 0.22 μm-filtered plasma and 100 μL of 1 M sodium citrate, 20 mM HEPES, pH 7.4. No bead agglomeration was observed. The proteins were desorbed using a two-step elution. First, the beads were incubated twice with 100 μL of the kit elution reagent (4 M urea, 1% (w/v) CHAPS, 5% (v/v) acetic acid) for 15 minutes. Then, 100 μL of 6 M guanidine-HCl, pH 6.0, were added twice for 15 minutes. For each column, the four elution fractions were pooled and stored at −80 °C until analysis.

Buffer Exchange and Protein Content Estimation.
Using 2000 MWCO Hydrosart Vivaspin 2 spin concentrators (Sartorius Stedim Biotech, Göttingen, Germany), the prefractionated plasma samples were concentrated and buffer-exchanged, by four rounds of centrifugation, into 0.5 M triethylammonium bicarbonate (TEAB) buffer, pH 8.5 (Sigma-Aldrich Corporation, Saint Louis, MO, USA), for downstream analysis. The protein concentrations of whole and prefractionated plasma samples were determined using a FluoroProfile Protein Quantification Kit (Sigma-Aldrich Corporation), with BSA as the standard.

1-Dimensional Gel Electrophoresis (1DGE).
Equal amounts of proteins from crude and prefractionated plasma, obtained by immunoaffinity depletion or hexapeptide ligand library treatment, were diluted in SDS loading buffer at 2 mg/mL and heated to 100 °C for 10 minutes. The eluted beads from one ProteoMiner column were then washed and directly boiled at 100 °C for 10 minutes in 100 μL of SDS loading buffer. The proteins were then separated on a home-made 12.5% SDS-PAGE gel (16 cm long and 1.5 mm thick), using a standard Laemmli buffer system in an SE 600 Ruby electrophoresis unit (GE Healthcare, Chalfont Saint Giles, UK). Precision Plus Protein Standards (Bio-Rad Laboratories) were loaded in the molecular weight marker lane. The gel image was acquired on an Ettan DIGE Imager (GE Healthcare) after total protein fluorescent poststaining with Deep Purple (excitation, 532 nm; emission, 610 nm), according to the standard protocol (GE Healthcare).

The proteins were reduced for 1 hour at 60 °C with 5 mM tris-(2-carboxyethyl)phosphine (TCEP) and cysteine-blocked with 10 mM methyl methanethiosulfonate (MMTS) at room temperature for 10 minutes. The proteins were then digested for 40 hours at 37 °C with 10 μg of TPCK-treated trypsin with CaCl2 (Applied Biosystems, Foster City, CA, USA). Each peptide solution was labelled for 3 hours at room temperature, using an iTRAQ reagent previously reconstituted in 70 μL of ethanol, according to the iTRAQ Reagents Multiplex Kit protocol (Applied Biosystems). The reaction was stopped by adding milliQ water, and the samples labelled with the 114, 115, 116, and 117 mass-tagged iTRAQ reagents, respectively, were combined according to the experimental protocol shown in Figure 1.

Peptide OFFGEL-IEF Fractionation.
For pI-based peptide separation, the 3100 OFFGEL Fractionator (Agilent Technologies) was used with a 24-well setup. Prior to electrofocusing, the peptide samples were desalted on a Sep-Pak C18 cartridge (Waters Corporation, Milford, MA, USA) and resolubilized in 3.6 mL of 5% (v/v) glycerol and 1% (v/v) IPG buffer, pH 3-10 (GE Healthcare). The 24 cm-long IPG gel strips (GE Healthcare), with a 3-10 linear pH range, were rehydrated for 15 minutes according to the manufacturer's manual. Then, 150 μL of sample was loaded into each of the 24 wells. Electrofocusing of the peptides was performed at 20 °C until 50 kVh was reached. After focusing, the 24 peptide fractions were withdrawn and the wells were rinsed with 200 μL of milliQ water/methanol/formic acid (49/50/1). After 15 minutes, each rinsing solution was pooled with its corresponding peptide fraction. All of the fractions were evaporated by centrifugation under vacuum and kept at −20 °C. Just prior to nano-LC separation, the fractions were resuspended in 20 μL of milliQ water with 0.1% (v/v) TFA.
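As a quick consistency check on the volumes and pH range given above, the following sketch divides the 3.6 mL sample over the 24 wells and computes a nominal pH window per well. It assumes that the 3-10 linear gradient is split evenly between the wells, which is an idealisation; the helper name `nominal_ph_windows` is hypothetical, and the values tabulated by the manufacturer may differ.

```python
def nominal_ph_windows(ph_min=3.0, ph_max=10.0, n_wells=24):
    """Return (low, high) pH bounds per well, assuming an evenly divided linear gradient."""
    width = (ph_max - ph_min) / n_wells
    return [(ph_min + i * width, ph_min + (i + 1) * width) for i in range(n_wells)]


sample_volume_ul = 3600.0  # 3.6 mL of resolubilised peptides
n_wells = 24
volume_per_well_ul = sample_volume_ul / n_wells  # matches the 150 µL loaded per well

windows = nominal_ph_windows()
print(f"{volume_per_well_ul:.0f} µL per well")
print(f"well 1:  pH {windows[0][0]:.2f}-{windows[0][1]:.2f}")
print(f"well 24: pH {windows[-1][0]:.2f}-{windows[-1][1]:.2f}")
```

Under this assumption each well covers about 0.29 pH units, which gives a sense of scale for the deviations between experimental and theoretical values.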

Nano-LC Separation.
The peptide fractions were separated on an Ultimate 3000 nano-LC system (Dionex Corporation, Sunnyvale, CA, USA), using a C18 column (PepMap100, 3 μm, 100 Å, 75 μm i.d. × 15 cm, Dionex Corporation) at a flow rate of 300 nL/minute. Buffer A comprised 2% ACN in milliQ water with 0.05% TFA, and buffer B comprised 80% ACN in milliQ water with 0.04% TFA. The peptide solutions were first desalted for 3 minutes on the precolumn using buffer A only, and the separation then took place over 70 minutes with the following gradient: 0 to 20% B in 10 minutes, 20% to 55% B in 55 minutes, and 55% to 100% B in 5 minutes. Chromatograms were recorded at a wavelength of 214 nm. From 12 minutes into the run, the peptide fractions were each collected for 10 seconds using a Probot microfraction collector (Dionex Corporation) and spotted directly onto a MALDI sample plate (1664 spots per plate, Applied Biosystems). The CHCA matrix (LaserBioLabs, Sophia-Antipolis, France), at a concentration of 2 mg/mL in 70% ACN in milliQ water with 0.1% TFA, was continuously added to the column effluent via a micro "T" mixing piece at a flow rate of 1.2 μL/min.

After screening of all LC-MALDI sample positions in MS-positive reflector mode, using 1500 laser shots, the fragmentation of automatically selected precursors was performed at a collision energy of 1 kV using air as the collision gas (pressure ∼2 × 10⁻⁶ Torr). MS spectra were acquired between m/z 800 and 4000. The parent ion of the Glu-1 fibrinopeptide at m/z 1570.677, diluted in the matrix (3 femtomoles per spot), was used for internal calibration. Up to 12 of the most intense ion signals per spot position, characterised by an S/N > 12, were selected as precursors for MS/MS acquisition. Peptide and protein identifications were performed using ProteinPilot Software v 2.0 (Applied Biosystems) and the Paragon algorithm [12].
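The nano-LC gradient described in this section can be written as a small piecewise-linear function, which makes it easy to check the mobile phase composition at any time point. This is only a sketch: it assumes that each segment is a linear ramp and that time zero is the start of the gradient (after the 3-minute desalting step), and the function names are hypothetical.

```python
# Gradient breakpoints (time in minutes, % buffer B) as stated in the text.
GRADIENT = [(0.0, 0.0), (10.0, 20.0), (65.0, 55.0), (70.0, 100.0)]


def percent_b(t_min, gradient=GRADIENT):
    """Linearly interpolate % buffer B at time t_min (minutes into the gradient)."""
    if not gradient[0][0] <= t_min <= gradient[-1][0]:
        raise ValueError("time outside the gradient programme")
    for (t0, b0), (t1, b1) in zip(gradient, gradient[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


def acn_percent(t_min):
    """ACN content of the mobile phase: buffer A is 2% ACN, buffer B is 80% ACN."""
    b = percent_b(t_min) / 100.0
    return 2.0 * (1.0 - b) + 80.0 * b


for t in (0, 10, 37.5, 65, 70):
    print(f"t = {t:>4} min: {percent_b(t):5.1f}% B, {acn_percent(t):5.1f}% ACN")
```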
Each MS/MS spectrum was searched against the UniProt/Swiss-Prot database (release 96, September 2008) for Homo sapiens, with the fixed modification of methyl methanethiosulfonate-labelled cysteine enabled. Other parameters, such as the tryptic cleavage specificity, the precursor ion mass accuracy, and the fragment ion mass accuracy, are MALDI 4800 built-in functions of the ProteinPilot software. The ProteinPilot software calculated a confidence percentage (the unused score), which reflects the probability of a hit being a "false positive," meaning that at the 95% confidence level, there is a false positive identification probability of about 5%. While this software automatically accepts all peptides with an identification confidence level > 1%, only proteins having at least one peptide above the 95% confidence level were initially recorded. Low-confidence peptides cannot give a positive protein identification by themselves but may support the presence of a protein identified using other peptides with higher confidence. Searches against a concatenated database containing both forward and reversed sequences enabled the false discovery rate to be kept below 1%.
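The false discovery rate obtained from a concatenated forward/reversed search can be illustrated with a minimal sketch. It assumes one common estimator for concatenated databases, FDR = 2 × decoy hits / all accepted hits; the calculation performed by the ProteinPilot software may differ, and the scores, threshold, and function name below are hypothetical.

```python
def estimate_fdr(hits, threshold):
    """Estimate the FDR for hits scoring >= threshold in a concatenated target-decoy search.

    hits: iterable of (score, is_decoy) pairs.
    Assumed estimator: FDR = 2 * decoys / (targets + decoys) among accepted hits.
    """
    accepted = [is_decoy for score, is_decoy in hits if score >= threshold]
    if not accepted:
        return 0.0
    n_decoy = sum(accepted)
    return 2.0 * n_decoy / len(accepted)


# Hypothetical example: 400 accepted hits, one of which matches a reversed sequence,
# plus 20 low-scoring decoy hits that fall below the threshold.
hits = [(2.0, False)] * 399 + [(2.0, True)] + [(0.5, True)] * 20
fdr = estimate_fdr(hits, threshold=1.0)
print(f"Estimated FDR: {fdr:.2%}")
```

In this made-up case the estimate is 0.5%, which would satisfy a 1% cut-off.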

Data Analysis.
To assess the quality of pI fractionation after OFFGEL-IEF and MALDI-MS/MS identification, the experimental pI was calculated for each peptide using the pI/MW tool of the ExPASy Proteomic Server [13], checking all the deamidation modifications that could influence its value. Then, the average experimental pI of the peptides (after filtering for false positive responses) was compared, for each of the 24 fractions, with the theoretical pH values provided by Agilent Technologies for 24 cm-long IPG gel strips with a 3-10 linear pH range. To study the relative abundance of proteins in the plasma, the MS/MS spectra that enabled protein identification with at least 2 peptides were counted for each protein [14].
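The spectral counting step described above can be sketched as follows. This is an illustration under simple assumptions, not the exact procedure of [14]: each identified spectrum is represented as a (protein, peptide) pair, and the function name, protein labels, and peptide labels are all hypothetical.

```python
from collections import defaultdict


def spectral_counts(psms, min_peptides=2):
    """Count MS/MS spectra per protein, keeping proteins with >= min_peptides distinct peptides.

    psms: iterable of (protein, peptide) pairs, one per identified spectrum.
    """
    spectra = defaultdict(int)
    peptides = defaultdict(set)
    for protein, peptide in psms:
        spectra[protein] += 1
        peptides[protein].add(peptide)
    kept = {p: n for p, n in spectra.items() if len(peptides[p]) >= min_peptides}
    # Sort by decreasing spectral count, used as a rough proxy for relative abundance.
    return sorted(kept.items(), key=lambda item: item[1], reverse=True)


# Hypothetical peptide-spectrum matches (labels are made up).
psms = [
    ("protein_A", "pep_1"), ("protein_A", "pep_1"), ("protein_A", "pep_2"),
    ("protein_B", "pep_3"), ("protein_B", "pep_3"),
    ("protein_C", "pep_4"), ("protein_C", "pep_5"),
]
ranking = spectral_counts(psms)
print(ranking)
```

Here `protein_B` is dropped because both of its spectra come from a single peptide, while the other two proteins are ranked by their spectral counts.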

Results and Discussion
Our strategy for the study of the human plasma proteome was based on a three-step fractionation. In the first step, the plasma samples were prefractionated using either an immunodepletion method or a peptide ligand library strategy. The proteins were then digested with trypsin, and the resulting peptides were iTRAQ-labelled and OFFGEL-fractionated into 24 fractions. Each fraction was then analysed by nano-LC on a C18 column (Figure 1).

Identification of Proteins.
The experimental design for the iTRAQ labelling of proteins from the immunodepleted and bead-treated plasma was the same. The prefractionated plasma samples were concentrated and dissolved in the appropriate iTRAQ buffer using spin concentrators before the steps of reduction, MMTS blocking, digestion, and iTRAQ labelling (Figure 1). After OFFGEL separation of 400 μg of iTRAQ-labelled peptides into 24 fractions, ProteinPilot software identified 332 proteins in the immunodepleted plasma and 320 proteins in the hexapeptide ligand library-treated plasma (Figure 2).
The average experimental pI value of each OFFGEL fraction is indicated by a bar in Figure 3. The theoretical pH values provided by the manufacturer are shown by a dashed line. The pI value of each identified peptide was calculated using Bjellqvist's algorithm [15], without taking into account the iTRAQ groups at the N-terminus and/or on the lysine side chain. Using these data, average pI values with standard deviations were calculated for all of the peptides identified in each fraction (Figure 3). The average experimental pI value deviated from the theoretical value by an average error of ±0.90 for both prefractionation strategies. From the immunodepleted plasma, 243 proteins were characterized by at least 2 peptides, and from the plasma treated with the hexapeptide bead library, 228 were associated with at least 2 peptides, suggesting that both prefractionation methods produced virtually the same number of identified proteins. Among these, 158 were common to both methods (Figure 2). Nevertheless, in addition to these shared proteins, 85 proteins (with at least 2 peptides) were identified by immunodepletion only, and 70 proteins (with at least 2 peptides) were identified by the equalizer strategy only, suggesting that these strategies are complementary. Merging both sets of data allowed a total of 313 proteins with at least 2 peptides to be identified. A previous study that we conducted with 400 μg of immunodepleted plasma, treated under the same conditions but without iTRAQ labelling, identified 115 proteins (Supplementary Material available online at doi:10.1155/2010/927917), in agreement with the results of Song et al. [10]. This result demonstrated the efficiency of iTRAQ labelling for peptide ionization and protein identification, in accordance with our previous study [11].
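The overlap figures quoted above follow from simple set arithmetic, which the short check below reproduces from the three reported counts. The variable names are ours; the shared fractions printed at the end are derived values, not figures stated in the text.

```python
# Counts of proteins identified with at least 2 peptides, as reported in the text.
immunodepleted = 243
hexapeptide_library = 228
common = 158

only_immunodepleted = immunodepleted - common      # proteins unique to immunodepletion
only_hexapeptide = hexapeptide_library - common    # proteins unique to the bead library
merged_total = common + only_immunodepleted + only_hexapeptide

# Share of each method's identifications that the other method also found.
overlap_immunodepleted = common / immunodepleted
overlap_hexapeptide = common / hexapeptide_library

print(only_immunodepleted, only_hexapeptide, merged_total)
print(f"{overlap_immunodepleted:.0%} / {overlap_hexapeptide:.0%} shared")
```

The check returns 85, 70, and 313, in agreement with the reported numbers.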

Evaluation of the First Prefractionation Step.
The human plasma was prefractionated using two different methods: an immunodepletion strategy on a human MARS-14 column, which depleted the 14 most abundant plasma proteins, and a peptide ligand library technology with a ProteoMiner column, which should "equalize" the plasma proteins. The reproducibility of these approaches was evaluated in four independent experiments (Figure 1). The total protein content was used to investigate reproducibility and protein recovery (Table 1). The total protein concentration of untreated plasma was 63 mg/mL. The mean recovery rate of the eluted bead-treated plasma was 2.4% (in agreement with previous results [8]), compared with 5.8% following the depletion process using the MARS-14 column. Both methods were reproducible to within around 15% in the determination of total protein content.
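To give a sense of the protein amounts behind these recovery rates, the sketch below converts them into masses, using the stated plasma concentration and the loaded volumes given in the methods (900 μL per ProteoMiner column, 10 μL per depletion run). It assumes that recovery is expressed relative to the protein mass loaded, which the text does not state explicitly. The replicate values used for the coefficient of variation are hypothetical and serve only to illustrate the calculation.

```python
from statistics import mean, stdev

PLASMA_CONC_MG_PER_ML = 63.0  # total protein concentration of untreated plasma


def loaded_mass_mg(volume_ul):
    """Protein mass (mg) contained in a given volume (µL) of untreated plasma."""
    return PLASMA_CONC_MG_PER_ML * volume_ul / 1000.0


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, as a percentage."""
    return 100.0 * stdev(values) / mean(values)


# Mass recovered per run, from the reported mean recovery rates.
proteominer_mg = loaded_mass_mg(900) * 0.024  # 900 µL per column, 2.4% recovery
mars_mg = loaded_mass_mg(10) * 0.058          # 10 µL per depletion run, 5.8% recovery
print(f"ProteoMiner: {proteominer_mg:.2f} mg per column")
print(f"MARS-14: {mars_mg * 1000:.1f} µg per 10 µL run")

# Hypothetical replicate recoveries (%), only to illustrate the CV calculation.
replicates = [2.0, 2.3, 2.5, 2.8]
cv = coefficient_of_variation(replicates)
print(f"CV = {cv:.1f}%")
```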
For each approach, the reproducibility of the biological experiment and of the iTRAQ labelling was evaluated by calculating the coefficient of variation from the ProteinPilot results. The average variation across the 4 experiments was ±18% with the human MARS-14 strategy and ±11% with the ProteoMiner approach. Separation of native, immunodepleted, and bead-treated plasma samples by SDS-PAGE revealed a significant reduction in the dynamic range of protein concentration in the treated fractions when compared with native plasma (Figure 4), but did not identify one prefractionation method as being superior to the other. Comparison of the 20 most abundant proteins, evaluated by the MS/MS spectral counting technique, indicated that both prefractionation methods were equivalent in terms of the estimated protein concentrations (Figure 5). With ProteoMiner treatment, fibrinogen alpha and beta chains were the most commonly found proteins, suggesting that this technique could be more suitable for serum samples.

[...] of proteins with predicted extracellular locations (33%) (Figure 6(b)) was present in our plasma proteome map. Functional classification also revealed that most of the proteins are involved in "binding" and enzyme activities (Figure 6(a)). Among these proteins, we identified medium-concentration proteins, with concentrations of around 30 ng/mL, such as P-selectin and cadherin 5 [16,17] (Table 2). We successfully detected the low-concentration proteins hepatocyte growth factor activator, insulin-like growth factor-binding protein 2, and sex hormone-binding globulin in both experiments, with a ProteinPilot unused score >2 (proteins identified with at least 2 peptides; 95% confidence). The literature data showed that the concentration of these proteins was in the range of 8-18 ng/mL [16,18,19].
Compared with the concentration of the most abundant plasma protein (HSA), which is around 50 mg/mL, we can conclude that the dynamic range for detecting low-abundance plasma proteins could be extended to 10⁶ to 10⁷.
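The dynamic range estimate can be verified directly from the concentrations quoted in the text, as in the short calculation below. The HSA value of about 50 mg/mL and the 8, 18, and 30 ng/mL literature values come from this section; the function name is ours.

```python
import math

HSA_NG_PER_ML = 50e6  # ~50 mg/mL expressed in ng/mL


def orders_of_magnitude(low_ng_per_ml, high_ng_per_ml=HSA_NG_PER_ML):
    """log10 of the concentration ratio between an abundant and a scarce protein."""
    return math.log10(high_ng_per_ml / low_ng_per_ml)


for conc in (30.0, 18.0, 8.0):  # ng/mL, literature values quoted in the text
    ratio = HSA_NG_PER_ML / conc
    print(f"{conc:>4} ng/mL: ratio {ratio:.1e}, "
          f"{orders_of_magnitude(conc):.1f} orders of magnitude")
```

All three ratios fall between 10⁶ and 10⁷, consistent with the stated range.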

Conclusions
The number of proteins that could be identified in 400 μg of plasma proteins was markedly increased in this study compared with similar samples studied without iTRAQ labelling, or with iTRAQ labelling but without OFFGEL fractionation [10], suggesting that our strategy improved the proteome coverage of human plasma. The limited number of individual proteins identified in this study, despite prefractionation, highlights the challenge of plasma-based biomarker discovery. From our experience, similar iTRAQ analyses of cellular extracts are able to identify [...] [11]. Clearly, the presence of many highly abundant proteins in human plasma and, therefore, of many highly concentrated peptides after trypsin digestion prevents good MALDI ionization of weakly concentrated peptides and thus limits the depth of analysis. These results argue in favour of the search for new strategies for the removal of abundant plasma proteins [27], or for the enrichment of less abundant proteins, in order to facilitate the efficient discovery of biomarkers.

Table 2: Some of the weakly abundant plasma proteins identified with two or more peptides after immunoaffinity depletion of high-abundance proteins (IM) or hexapeptide ligand library treatment (PM). Associated concentrations were found in the literature in the midrange from 5 to 50 000 ng/mL serum or plasma.