Biomarker Amplification by Serum Carrier Protein Binding

Mass spectrometric analysis of the low molecular mass (LMM) range of the serum/plasma proteome is a rapidly emerging frontier for biomarker discovery. This study examined the proportion of LMM biomarkers that are bound to circulating carrier proteins. Mass spectrometric analysis of human serum following molecular mass fractionation demonstrated that the majority of LMM biomarkers exist bound to carrier proteins. Moreover, the pattern of LMM biomarkers bound specifically to albumin is distinct from that of biomarkers bound to non-albumin carriers. Prominent SELDI-TOF ionic species (m/z 6631.7043) identified to correlate with the presence of ovarian cancer were amplified by albumin capture. Several insights emerged: a) accumulation of LMM biomarkers on circulating carrier proteins greatly amplifies the total serum/plasma concentration of the measurable biomarker, b) the total serum/plasma biomarker concentration is largely determined by the carrier protein clearance rate, not the clearance rate of the unbound biomarker itself, and c) the LMM species bound to a specific carrier protein may contain important diagnostic information. These findings shift the focus of biomarker detection to the carrier protein and its biomarker content.

Traditionally, mass spectrometry analysis of complex protein mixtures involves an upfront chromatographic separation step [4][5][6], followed by enzymatic fragmentation of the separated proteins for direct MS-MS identification [14][15][16]. In contrast to such traditional approaches, recent applications of mass spectrometry to biomarker analysis have used the native undigested serum proteome as a launch point for biomarker discovery [9][10][11][12][13], and may have been successful precisely because no enzymatic treatment was conducted. Small molecular mass biomarkers may in fact be created by specific disease related enzymatic cleavage or posttranslational modification of larger proteins. Enzymatic treatment prior to analysis may destroy or mask this information content by cleaving disease biomarkers and by creating large quantities of enzymatic fragments from high abundance proteins.
Several groups of investigators have reported the discovery of low molecular mass diagnostic biomarkers using direct mass spectrometry analysis of nonenzymatically treated serum or plasma [8][9][10][11][12][13]. Since the vast majority of the proteins in the test sample are above the range of accurate detection (> 10 kDa) by the mass spectrometer, the low molecular mass biomarkers that emerge from the analysis must be derived from one of two possible sources: molecules a) free in solution or b) bound to larger carrier molecules. SELDI-TOF and MALDI-TOF mass spectrometry analysis involves the laser-induced ionization of dried mixtures of molecules adherent to a surface [17,18]. The generated ions can represent biomarkers originally existing in the free unbound phase or existing in a bound state with larger proteins. Since the low molecular mass serum/plasma proteome is largely uncharacterized, there has been no previous analytical or experimental estimate of the relative proportion of small molecular mass biomarkers that exist in the free versus the bound phase.
Under the assumption that the low molecular mass biomarkers contain important diagnostic information and that protein biomarkers useful in disease detection are of very low abundance, the search for biomarkers usually begins with a separation step to remove the abundant high molecular mass "contaminating" proteins such as albumin, thyroglobulin, and immunoglobulins so that the analysis can focus on the lower molecular mass region [14][15][16]. The purpose of the present study is to examine the proportion of the low molecular mass species bound to the high molecular mass fraction of the serum/plasma proteome. From a physiologic perspective, free phase low molecular mass molecules (< 30 kDa) should be rapidly cleared through the kidney [19][20]. Such rapid physiologic excretion may reduce the concentration of free phase low molecular mass species to a level below detection. In the face of the vast excess of high molecular mass serum proteins, low abundance, low mass species are likely to bind to large carrier proteins. The abundant high molecular mass carrier proteins exist above the cut-off for kidney clearance [19,20], and hence possess a half-life that is many orders of magnitude larger than that of small molecules. Circulating carrier proteins may thus become the reservoir for the accumulation and amplification of bound low mass biomarkers. Non-covalent association with albumin has been shown to extend the half-life of short-lived proteins introduced into the circulation [21][22][23]. Many investigators are now employing and/or developing methods by which the higher abundance proteins above 30 kDa are specifically subtracted from native serum/plasma prior to analysis [4][5][6]; this practice may dramatically diminish the chances of finding the important low abundance and low molecular mass disease biomarkers.
We examined the proportion of low molecular mass species detectable by SELDI (surface enhanced laser desorption and ionization) that are associated with the higher molecular mass serum proteome. Human serum was fractionated into high molecular mass and low molecular mass native fractions. Each fraction was assayed by SELDI to assess whether the preponderance of low molecular mass ions is found in the low or the high molecular mass fraction. We further examined the subpopulation of molecular species bound to albumin compared to the total carrier protein fraction. We explored whether SELDI-TOF identified biomarkers correlating with presence of ovarian cancer were associated with high molecular mass carrier proteins. Finally, we explored the theoretical implications of biomarker amplification due to carrier protein binding. The overall aim was to evaluate the use of carrier proteins as an affinity capture means for disease relevant biomarkers.

Serum samples
Serum samples were derived from the ovarian cancer clinical study set of the National Ovarian Cancer Early Detection Program, Northwestern University. The full characteristics of this study set have been described previously [9], and are posted in detail on http://clinicalproteomics.steem.com. Bioinformatic serum proteomic pattern analysis was conducted as described previously [9]. Further details are provided on http://clinicalproteomics.steem.com.

Mass spectrometry
Surface Enhanced Laser Desorption and Ionization (SELDI) was conducted using a PBSII (Ciphergen Biosystems) as described previously [17,18]. Human serum was collected and anonymized as previously reported [17,18]. Analysis was conducted on a WCX2 (weak cation exchange) chip. The serum was fractionated into molecular mass classes under native conditions.

Mass fractionation
Thirty microliters of unfractionated human native serum was introduced into a Sephadex G-25 or a Sephadex G-50 molecular sieve spin column according to the manufacturer's instructions. The column was centrifuged at 3000 x g for three minutes, and approximately 30 microliters of eluate containing the high molecular mass fraction was collected. The eluate was treated with 50% acetonitrile (w/w in water) for 30 minutes to dissociate bound molecules and was then transferred to the inlet of a molecular filtration microcolumn (Microcon YM-30 Centrifugal Filter Device, Millipore). The column was centrifuged at 1000 x g, and the eluate containing the low molecular weight fraction was collected. All fractions at each stage were sampled and one microliter was analyzed by SELDI.

Albumin separation
Segregation of albumin and its low molecular mass binding constituents was conducted using the Montage Albumin Deplete Kit. 100 µL of human serum was diluted one to one with Equilibration Buffer provided with the kit for a final volume of 200 µL, and vortexed. The column was rehydrated twice with 400 µL of Equilibration Buffer and centrifuged through the column insert for 2 minutes at 2,000 rpm. 200 µL of diluted serum was introduced into the rehydrated albumin column and centrifuged for 2 minutes at 2,000 rpm. The eluate from the column contained the serum without albumin. The bound fraction contained the albumin and the low molecular weight species bound to the albumin. We then added 400 µL EAM solution composed of 50% acetonitrile and 0.1% TFA to the column to strip the column and dissociate albumin from its bound species. After 30 minutes, EAM solution was centrifuged through the column at 2,000 rpm for 3 minutes. The eluate contained the dissociated albumin and low molecular weight species that bind to albumin.
Analysis of the proteins bound to the column was performed by ion trap mass spectrometry, in line with an LCQ Classic MS (ThermoFinnigan, San Jose, CA) with a modified nanospray source. With dynamic exclusion enabled, the three most abundant peptide ions from each full MS scan were selected for MS/MS analysis by collision induced dissociation with a normalized collision energy of 35% and an activation time of 30 ms. Ion spray voltage was 2.00 kV with a capillary voltage of 26.20 V and a capillary temperature of 160 °C. Results for MS/MS scans were searched and compared with theoretical spectra in the Sequest Browser database specified for human proteins.

SELDI/TOF
WCX2 protein arrays were processed in a bioprocessor (Ciphergen Biosystems, Inc). 100 µl of 10 mM HCl was applied to the protein arrays in the bioprocessor and allowed to incubate for 5 minutes. The HCl wash was aspirated and discarded and 100 µl of H2O was applied and allowed to incubate for one minute. The H2O was aspirated and discarded, then reapplied for another minute. 100 µl of 10 mM ammonium acetate with 0.1% TritonX was applied to the surface and allowed to incubate for 5 minutes. The ammonium acetate was aspirated and discarded. A second application of ammonium acetate was applied and allowed to incubate for 5 minutes. The chip surfaces were then dried using a vacuum to remove any excess amount of liquid. Five microliters of raw sera, or molecular mass fraction, or eluate, was then applied to each chip surface and allowed to incubate for 55 minutes. Each protein chip was washed six times with 150 µl of PBS and H2O and then vacuum dried. Cross contamination was eliminated between spots by using a bioprocessor gasket. The gasket was removed and 1.0 µl of a saturated solution of the Energy Absorbing Molecule cinnamic acid (25% saturation) in 50% (v/v) acetonitrile, 0.5% trifluoroacetic acid was applied to each spot on the protein array twice, allowing the solution to dry between applications.

Mathematical modeling
The kinetics of biomarker production, carrier protein binding, and clearance were modeled as a deterministic compartmental model with first order kinetics.

Association of LMW species with HMW carrier proteins
Analysis of native human serum fractionated into high and low molecular mass fractions revealed that the vast majority of low molecular mass serum/plasma biomarkers detectable as MS ions exist bound to large carrier proteins. SELDI analysis of native serum fractions of high and low molecular mass, shown in Fig. 1, demonstrates that virtually all of the detectable ions are derived from molecular species bound to large carrier proteins. In fact, removal of the high molecular mass proteins under native conditions (Fig. 1(B)), a common method used for biomarker discovery [14][15][16], removes a significant proportion of the ions generated by SELDI-TOF. Comparing the spectra of Figs 1(A) to 1(C) indicates that the majority of ions generated from unfractionated serum are derived from species associated with larger carrier proteins. Figure 1(D) displays the ion spectra of species previously bound, and then dissociated and separated from the higher molecular mass fraction. The intensity and number of many ion species are augmented when comparing Figs 1(A) and 1(C).

Populations of LMW species associated specifically with albumin
Figure 1(E) displays the ions associated with the non-albumin carrier proteins, and Figure 1(F) displays the ions generated from species bound only to albumin. A significant proportion of the ions in the spectra appear to be derived from species associated with albumin, compared to non-albumin carrier proteins.
We verified through microcapillary LC MS/MS that our albumin bound fraction acquired through stripping the Montage Albumin Deplete Column was entirely albumin and its bound low molecular mass species. Since there was no indication of a significant proportion of other high molecular mass proteins bound to the albumin specific column, we can assume that the low molecular mass species detected were derived from a specific association with albumin, or at least aggregated with and co-separated with albumin. Furthermore, we have positively identified hundreds of low molecular mass species after dissociation from their albumin carrier. Additional studies which will be submitted for publication elsewhere involve characterizing the entire repertoire of low molecular mass species bound to individual serum carrier proteins by LC MS/MS.
We next addressed the question as to whether the ions generated from the species bound to albumin contained disease biomarker information. SELDI-TOF ion patterns, generated on the WCX2 chips, correlating with ovarian cancer were identified by methods described previously [9] (http://clinicalproteomics.steem.com). Two clinical sera data sets were employed and all spectra are provided on the website for downloading as follows: Dataset 8-7-02: Ovarian Dataset 8-7-02.zip, Ovarian Sample Info 8-7-02.xls. The sample set included 91 unaffected controls and 162 ovarian cancers. The following selection of ion mass/charge (m/z) values generated a pattern that was 100% predictive in the training and blinded testing -2760. 6685
Randomly selected representative serum samples from this study set were analyzed by MALDI-TOF, comparing the spectra generated by unfractionated sera to the spectra generated only from the species bound to albumin. As demonstrated in Fig. 2, the spectra generated from the species bound to albumin are complex and exhibit a number of differences between the cancer and the unaffected ("normal") cases shown in the example. Comparing the peak intensities between the unfractionated serum (containing all the carrier proteins with their associated or bound species) and the albumin-bound fraction (Fig. 2) indicates that a significant proportion of putative disease biomarkers may be associated with albumin. Figure 3 is an example of ion 6631.7043, a member of the ion pattern 100% correlated with ovarian cancer in this clinical study set. Matched for dilution and amplitude, the predicate ion is highly associated with albumin, and the ionization intensity is augmented in the albumin bound fraction. This demonstrates the albumin binding selectivity of a specific SELDI-TOF ion associated with ovarian cancer.
Figure 4 displays the ion spectra for a pooled ovarian cancer serum sample in which the ion species bound only to albumin are compared for different amounts of albumin captured on the column. The captured albumin with its associated species was denatured and its binding partners were dissociated. When a higher number of albumin molecules were stripped of their associated species, the amplitude and complexity of the LMW species, including those in the region of putative ovarian cancer biomarkers, were augmented. These data indicate that albumin capture is a feasible method for biomarker enrichment.

Carrier protein concentration, C_r(t): Dependency on clearance rate
At any point in time, the total concentration of the biomarker is dependent upon the biomarker production rate, the biomarker clearance/excretion rate, the binding of the biomarker to a circulating carrier protein, and the clearance/excretion rate of the carrier protein.
We can view the blood intravascular space as a single compartment with volume $V$. We define the concentration of the carrier protein r as $C_r$, where the rate of carrier protein production is $k_{in,r}$, and the rate of its elimination or removal is $k_{out,r}$.
Then the change in the carrier concentration can be expressed as:

$$\frac{dC_r(t)}{dt} = \frac{k_{in,r}}{V} - k_{out,r}\, C_r(t) \tag{1}$$

Using the Laplace transform, with the initial condition $C_r(t) = 0$ at $t = 0$, the concentration of the carrier protein at time $t$ is:

$$C_r(t) = \frac{k_{in,r}}{V\, k_{out,r}} \left(1 - e^{-k_{out,r}\, t}\right) \tag{2}$$
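The single-compartment carrier model described above can be sketched numerically. The sketch assumes constant production $k_{in,r}$ into a volume $V$ and first order elimination $k_{out,r}$, so that $dC_r/dt = k_{in,r}/V - k_{out,r} C_r$ with $C_r(0) = 0$. The function names and all parameter values are hypothetical illustration choices, not values from this study.

```python
import math

def carrier_concentration(t, k_in, k_out, volume):
    """Closed-form C_r(t) for dC/dt = k_in/volume - k_out*C with C(0) = 0."""
    return (k_in / (volume * k_out)) * (1.0 - math.exp(-k_out * t))

def euler_concentration(t_end, k_in, k_out, volume, steps=100_000):
    """Forward-Euler integration of the same equation, as an independent check."""
    dt = t_end / steps
    c = 0.0
    for _ in range(steps):
        c += dt * (k_in / volume - k_out * c)
    return c

# Hypothetical illustration values (assumptions, not taken from the study).
K_IN = 1000.0                          # production, amount per day
HALF_LIFE_DAYS = 19.0                  # assumed carrier half-life
K_OUT = math.log(2) / HALF_LIFE_DAYS   # first order rate constant, per day
VOLUME = 3000.0                        # compartment volume, mL

STEADY_STATE = K_IN / (VOLUME * K_OUT)
for day in (1, 19, 60, 200):
    c = carrier_concentration(day, K_IN, K_OUT, VOLUME)
    print(f"day {day:>3}: C_r = {c:.4f} ({c / STEADY_STATE:.1%} of steady state)")
```

Under these assumptions the concentration reaches half of its steady state value after one half-life and approaches $k_{in,r}/(V k_{out,r})$ at long times.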

Amplification of biomarker concentration in the presence of the carrier
We assume that a biomarker is continuously produced or shed from the tissue source over time. As shown by the experimental data, biomarker molecules can accumulate over time in a carrier-bound form. At steady state, the total concentration of a biomarker measured in a blood sample can therefore become elevated due to its association with the carrier protein. The level of amplification (A) of the biomarker concentration at steady state, due to the presence of the carrier protein, can be defined as the following ratio, where $C_B$ is the concentration of the biomarker.

$$A = \frac{C_B(t)\ \text{in the presence of carrier protein}}{C_B(t)\ \text{in the absence of carrier protein}} \tag{3}$$
Plasma biomarkers reflecting a physiologic or disease state of perfused tissues are expected to exist at concentrations many orders of magnitude below the concentration of large carrier proteins such as albumin and immunoglobulins. The experimental findings are logical because, even with a low affinity for the carrier protein, the majority of the biomarker molecules will tend to be associated with the vast excess of circulating carrier protein. Consequently, it is also logical that the biomarker will take on the clearance rate of the carrier protein it is associated with.
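The argument that a low-affinity biomarker is still mostly carrier-bound can be illustrated with a short calculation. This is a minimal sketch assuming simple 1:1 equilibrium binding with the carrier in large molar excess, so that the bound fraction is approximately [carrier] / (Kd + [carrier]). The albumin concentration of about 0.6 mM (roughly 40 g/L at about 66.5 kDa) is an approximate textbook value, and the dissociation constants are hypothetical.

```python
def fraction_bound(carrier_molar, kd_molar):
    """Bound fraction of a trace ligand when the carrier is in large excess.

    For 1:1 binding with [carrier] >> [ligand], free carrier is approximately
    total carrier, so bound/total = [carrier] / (Kd + [carrier]).
    """
    return carrier_molar / (kd_molar + carrier_molar)

ALBUMIN_MOLAR = 6.0e-4  # approximate serum albumin concentration (assumption)

# Hypothetical dissociation constants, from high affinity to low affinity.
for kd in (1e-9, 1e-6, 1e-5, 1e-4):
    bound = fraction_bound(ALBUMIN_MOLAR, kd)
    print(f"Kd = {kd:.0e} M -> {bound:.1%} bound")
```

Even at a Kd of 100 µM, which would be considered weak binding, most of the biomarker is carrier-bound under these assumptions because the carrier concentration exceeds the Kd.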
The concentration of the biomarker $C_B(t)$ as a function of time can therefore be described as the balance between the biomarker production rate $k_{in,b}$, distributed in the blood volume $V$, and the loss or clearance of the biomarker bound to the carrier protein, $k_{out,br} C_{br}(t)$, where "br" refers to bound biomarker. If $k_{in,b}$ is assumed to be a simple constant production rate, as was assumed for the carrier protein (Eq. (1)), and clearance follows first order kinetics, so that a constant proportion of the carrier-bound biomarker is cleared per unit time, then

$$\frac{dC_B(t)}{dt} = \frac{k_{in,b}}{V} - k_{out,br}\, C_{br}(t) \tag{4}$$

With the initial condition $C_B(t) = 0$ at $t = 0$, and with essentially all of the biomarker in the bound form ($C_B \approx C_{br}$), the solution is

$$C_B(t) = \frac{k_{in,b}}{V\, k_{out,br}} \left(1 - e^{-k_{out,br}\, t}\right) \tag{5}$$

At $t \to \infty$, or steady state, we have

$$C_B = \frac{k_{in,b}}{V\, k_{out,br}} \tag{6}$$

Because the biomarker bound to the carrier protein acquires the clearance rate of the carrier protein, $k_{out,br} = k_{out,r}$, the total steady state concentration of biomarker in the plasma becomes a simple function of the biomarker production rate and the clearance/excretion rate of the carrier protein.
The results of this analysis reveal that the final total concentration of the biomarker measurable in a blood sample is inversely proportional to the clearance rate of the carrier protein to which the biomarker is bound. Table 1 is a series of computed solutions to Eq. (6) for a range of hypothetical biomarker production rates, and for a series of different named carrier proteins. The clearance rates for serum carrier proteins listed in Table 1 are known [21]. The clearance and excretion rate for free biomarkers was chosen to span the known range for small molecules [19][20][21][27]. For a carrier protein such as albumin, with a long half-life, the resulting amplification (Table 1) can be several orders of magnitude. Carrier protein amplification thus becomes a major factor determining whether a low abundance biomarker can reach a threshold of concentration that is above the lower limits of detection.
Table 1. Theoretical prediction based on Eq. (6) of measured total biomarker concentration for selected high abundance serum carrier proteins, as a function of biomarker production rate and carrier protein half-life. The experimental results indicated that the majority of the biomarker species exists in a state of association with carrier proteins, whereby the clearance rate of the biomarker takes on the clearance rate of the carrier protein.
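The kind of calculation summarized in Table 1 can be sketched as follows. The sketch assumes the steady state relation C = k_in / (V k_out) with k_out = ln 2 / half-life, and that the bound biomarker fully adopts the carrier's clearance rate, so the amplification reduces to the ratio of carrier half-life to free biomarker half-life. The carrier half-lives are approximate textbook values, and the free biomarker half-life, production rate, and volume are hypothetical; none of these are the entries of Table 1.

```python
import math

def steady_state_concentration(k_in, half_life_days, volume_ml):
    """Steady state C = k_in / (V * k_out), with k_out = ln(2) / half-life."""
    k_out = math.log(2) / half_life_days
    return k_in / (volume_ml * k_out)

def amplification(carrier_half_life_days, free_half_life_days):
    """Ratio of bound to free steady state concentration at equal production."""
    return carrier_half_life_days / free_half_life_days

# Approximate half-lives in days (assumptions for illustration only).
CARRIER_HALF_LIVES = {"albumin": 19.0, "IgG": 21.0, "transferrin": 8.0}
FREE_HALF_LIFE_DAYS = 1.0 / 24.0   # hypothetical: one hour for a free small molecule
K_IN = 1.0                         # hypothetical production rate, fmol per day
VOLUME_ML = 3000.0                 # hypothetical compartment volume

for name, t_half in CARRIER_HALF_LIVES.items():
    conc = steady_state_concentration(K_IN, t_half, VOLUME_ML)
    amp = amplification(t_half, FREE_HALF_LIFE_DAYS)
    print(f"{name:<12} C_ss = {conc:.3e} fmol/mL, amplification = {amp:.0f}x")
```

Because the production rate and volume cancel in the ratio, the amplification in this simplified picture depends only on the two half-lives.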

Discussion
A growing body of scientific studies supports the importance of the low molecular mass region of the serum proteome as an uncharted resource for biomarker discovery [9,28]. The experimental data of the present study support the concept that the vast majority of small mass ions detected by mass spectrometry of native human serum exist in association with circulating carrier proteins of higher molecular weight. This conclusion has several important implications for biomarker physiology and biomarker measurement technology.
Experimental data presented in Figs 1-4 reveal that the majority of ions generated by SELDI-TOF analysis are associated with carrier proteins, rather than free in solution phase (Fig. 1(B) versus Figs 1(C)-(F)). Moreover, as shown in Figs 2 and 3, ion species altered in disease study sets may be those specifically captured on a single carrier protein. In the example, the carrier protein is albumin. In the past, extensive effort has been placed on separating and discarding the high abundance large carrier proteins in the native plasma so that the remaining low abundance, disease-related markers could be discovered. The present results indicate that the search for biomarkers should be directed to those molecules bound to the carrier proteins. Removal of high abundance serum/plasma proteins prior to proteomic analysis should be conducted after, not before, dissociation from binding partners. This separation approach has been attempted for 2-D gel analysis [4,5]. As shown in Fig. 4, albumin capture can be used as a means to enrich for disease relevant biomarkers specifically associated with albumin. This provides a novel method to harvest the quantities of biomarker species required for sequencing and identification.
The first implication from this study is that the concentration of a biomarker measured in serum or plasma is determined by the clearance rate or half-life of the carrier protein, not the biomarker clearance rate itself. As shown in Eq. (6), the concentration of the biomarker is a function of the ratio between the biomarker production rate from the tissue and the clearance rate of the carrier protein. This means that carrier protein binding amplifies the total biomarker concentration levels measured in serum or plasma. Amplification occurs because the carrier protein acts as a reservoir to accumulate the biomarker over time, as the tissue is continuously producing the biomarker. Thus a biomarker produced by a small volume of tissue such as the ovary [9,10], prostate [8,12,13], or breast [11], at a low rate (e.g. one femtomole per day) can accumulate to a concentration of one picomole in the serum because it binds to a carrier protein with a much longer half-life. In this example the existence of the carrier protein can raise the concentration of the biomarker to a range detectable by conventional assay technology [29]. Without the carrier protein, the free biomarker would be rapidly cleared by the kidney and would therefore reside at a steady state concentration many fold below the detection limits of assay technology.
The impact of this conclusion extends beyond current mass spectrometry detection technology. Small biomarkers are commonly not the province of two-site sandwich immunoassays [30,31]. This is because it is difficult to develop two antibody-binding sites on the same small molecule. In contrast, if the first half of the immunoassay sandwich was the carrier protein and the second half was the small biomarker, a sandwich immunoassay could be achieved.
It is logical that the biomarker clearance rate becomes the carrier protein clearance rate because the carrier protein, even if it has low affinity for the biomarker, is in vast excess. This means that if we know the clearance rate of a given carrier protein, and we know the serum/plasma concentration of the biomarker bound to that carrier protein, we can estimate a lower limit for the continuous production rate by the tissue (Eq. (6)). Thus, if the concentration of a generic, bound biomarker α is 3.8 ng/mL (42.22 fmol/mL, where the biomarker α with its carrier protein have a combined molecular weight of 90 kDa) and the carrier protein half-life is 2.43 days, then the production rate of α from the tissue is at least 45,000 femtomoles per day [21,25,27,32,33]. If α is produced by a one cubic centimeter tumor composed of 10^9 cells, then each cell would produce approximately 16,000 molecules per day. This approximation is consistent with previous experimental findings [32].
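The unit conversion and the production-rate estimate in this example can be retraced with a short calculation. This is a sketch under stated assumptions: it uses the steady state relation k_in = C V k_out with k_out = ln 2 / half-life, and the distribution volume is not stated in the text, so the value of 3,750 mL used here is a hypothetical choice that happens to give a rate close to the quoted figure. The function names are likewise illustrative.

```python
import math

def mass_to_fmol_per_ml(ng_per_ml, molecular_weight_da):
    """Convert a mass concentration in ng/mL to fmol/mL."""
    grams_per_ml = ng_per_ml * 1e-9
    mol_per_ml = grams_per_ml / molecular_weight_da
    return mol_per_ml * 1e15

def min_production_rate(conc_fmol_per_ml, half_life_days, volume_ml):
    """Steady state production rate k_in = C * V * k_out, in fmol per day."""
    k_out = math.log(2) / half_life_days
    return conc_fmol_per_ml * volume_ml * k_out

# Values quoted in the text.
conc = mass_to_fmol_per_ml(3.8, 90_000)   # 3.8 ng/mL at 90 kDa
HALF_LIFE_DAYS = 2.43

# Hypothetical distribution volume; the text does not give V.
ASSUMED_VOLUME_ML = 3750.0

rate = min_production_rate(conc, HALF_LIFE_DAYS, ASSUMED_VOLUME_ML)
print(f"concentration: {conc:.2f} fmol/mL")
print(f"production rate: {rate:,.0f} fmol/day (for V = {ASSUMED_VOLUME_ML:.0f} mL)")
```

The estimate scales linearly with the assumed volume, so a different choice of V changes the production rate proportionally.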
These data indicate that the low molecular mass proteome within the range detectable by MALDI-TOF exists predominantly in the bound phase. We propose that technologies that focus on efficient capture of the carrier proteins and specific elution of the low molecular weight biomarkers will yield the greatest amount of diagnostic information. The bound biomarkers may exist in concentrations ten to 500 times greater than their free counterparts. Since the carrier proteins exist in vast excess compared to the biomarkers, it is unlikely that the carrier proteins will become saturated with bound biomarkers. Moreover, based on its unique affinity topology [34], each carrier protein may have its own constellation of bound biomarkers. Indeed, the distribution of biomarkers among specific plasma/serum carrier proteins may carry important diagnostic information. Finally, these findings lead to the concept of artificial carrier molecules designed to harvest specific populations of biomarkers associated with target organs or diseases.