Challenges for Biomarker Discovery in Body Fluids Using SELDI-TOF-MS

Protein profiling using SELDI-TOF-MS has attracted increasing interest in the field of biomarker discovery over the past few years. The technology has great potential provided that parameters such as sample handling, SELDI settings, and data analysis are strictly controlled. Practical considerations for setting up a robust and sensitive biomarker discovery strategy are presented. This paper also reviews the biological fluids generally available, describing their particular properties and the preanalytical challenges inherent to sample collection and storage. Finally, some new insights into the challenges of biomarker identification and validation are provided.


Introduction
The objective of biomarker discovery is to identify specific protein markers likely to improve early diagnosis, monitor therapeutic outcomes, and facilitate the development of novel drug candidates [1,2]. The methodology relies on differential protein expression profiling. The fundamental approach is based on the assumption that the pathology of concern affects some physiological processes, causing changes in protein expression levels. Proteins generating similar signals in both sample groups are ignored, while significantly up- and downregulated proteins become potential biomarkers. Differential expression profiling requires both a sensitive technology able to discern small differences and a high-throughput system able to process the large series of samples required to reach statistical significance. Protein differential display techniques such as two-dimensional gel electrophoresis (2-DE), one- or two-dimensional liquid chromatography coupled to mass spectrometry (LC-MS), or surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS) are regarded as the most powerful tools for establishing fingerprint profiles [3-6].
Many reports regarding the application of the SELDI-TOF-MS technology have been published since its introduction in 1993 [7] and its first use for disease detection [8]. One of the key features of SELDI-TOF-MS is its ability to provide rapid protein expression profiles from a variety of biological samples with minimal requirements for purification and separation of proteins prior to mass spectrometry. SELDI-TOF-MS profiling studies revealed that biological fluids contain many proteins with low molecular weight (<15 kDa) not resolved on conventional 2D gels [6,9].
As can be seen in Figure 1, the SELDI technique relies on surface arrays offering various chromatographic modes, based on both classic chemistries (normal phase, hydrophobic, cation- and anion-exchange surfaces) and specific affinity-coated surfaces (e.g., immobilized metal affinity capture, IMAC). After the sample has bound to these surfaces, unbound proteins are washed away and the retained molecules are overlaid with an energy-absorbing matrix. In the final step, mass spectra are recorded using a laser for ionization and a TOF mass spectrometer for its resolving power. Recent interest in the field has yielded a large number of candidate biomarkers in various diseases. However, the small size and poor design of some studies made validation of these biomarkers quite challenging [36-41].
In the context of clinical proteomics using SELDI-TOF-MS, many recent reviews have discussed newly identified disease biomarkers [13,21,22,24,27,30,35,42-44]. The present review focuses on the technical challenges encountered with SELDI-TOF-MS technology, taking into account insights from the last three years. It addresses the critical steps that should be undertaken to avoid bias and to maximize reproducibility and detection sensitivity, with the final aim of finding relevant, specific, and robust biomarkers [45,46]. For prospective studies, current knowledge of the different biological fluid sources available for SELDI-TOF-MS experiments is described, together with their respective advantages and limitations.
In the early phase of biomarker discovery, the clinical question has to be defined in the context of the disease(s), and adequate control samples have to be collected. Indeed, many published studies can be criticized for comparing patients to healthy subjects rather than to patients presenting similar diseases or clinical signs.
Experimental workflow and technologies have to be selected with great care. Avoiding bias is not trivial and must be addressed throughout the whole study, from its design to data analysis and interpretation (cf. Figure 2). Current proteomics and genomics technologies are extremely sensitive and can detect very small changes in expression levels. Some of these changes may arise from biological differences related to disease or pharmacological treatment. They could also result from the heterogeneity of the patient panels tested across multiple sites, the inherent biological complexity, and the diversity of sample types. Small differences in sample collection, processing, and analytical techniques can also affect the outcome of a study. As a consequence, clinical data may be site-, study-, population-, or sample-dependent, without any actual clinical relevance [53,62,64-66]. The key factor for maximizing reproducibility in biomarker research is to identify and minimize all potential sources of preanalytical and analytical bias [53,55] (Table 1). Adherence to strict guidelines and standard operating procedures (SOPs) is critical to reach the highest operating standards for data quality and reproducibility [37,38,47,50,51,53,55,62,63,67]. SOPs also facilitate the validation of biomarkers by other groups using different sets of samples.

Sample Handling and Preparation
Besides the instrumentation and the methodologies related to chromatography-mass spectrometry analysis, the nature, quality, and number of clinical samples to process are key elements to be considered in any proteomic approach.

Selection of Body Fluid.
To answer a precise clinical question, the investigator has to choose the most relevant biological target sample (body fluid, tissue, etc.). Many criteria must be considered at this level, including availability, ease of collection, stability, composition, proximity to the disease location, patient discomfort, and ethics.

Table 1: Potential sources of preanalytical and analytical bias.
Sample-handling procedures
(i) Collection protocols (initial processing, procedure, timing, type of anticoagulant, type of tubes, number of sites, etc.)
(ii) Storage procedures (time, aliquoting, storage materials, temperature, freeze-thaw cycles, etc.)
Experimental protocols
(i) Array types
(ii) Sample pH and dilution factor
(iii) Quantity of sample loading and position on arrays
(iv) Sample binding, washing, and drying procedures
(v) Matrix addition (type and method)
(vi) Instrument settings
(vii) Number of instruments, locations
(viii) Environmental factors (temperature, humidity percentage)
Data analysis methods
(i) Spectrum processing (baseline subtraction, normalization, alignment, noise reduction, etc.)
(ii) Peak labelling
(iii) Feature selection, statistical analysis
(iv) Classification approaches

Advantages and limitations of the main body fluids are presented in Table 2, together with recommendations. From a review of a series of studies [47-49, 51, 54, 57, 58, 60, 61, 67, 68, 73, 81-83, 85], general guidelines can be put forward. Optimal serum clotting is achieved after 60 minutes at room temperature. After clot formation, samples can be transported or stored on wet ice for 3 hours before centrifugation. Aliquots must be prepared and stored at −80 °C. For plasma collection, EDTA is the preferred anticoagulant, and processing should be carried out as soon as possible after sampling, ideally within the first hour. Although storage at low temperature promotes peptide and protein stability, storing plasma samples at 4 °C is not recommended because of the cold activation of platelets. Prior to freezing, plasma can be depleted of platelets by a filtration step; aliquots are then frozen at −80 °C.
According to the recommendations of the "HUPO PPP Specimens Committee," plasma appears preferable to serum because it contains fewer degradation peptides and consequently shows less variability [57,81,86]. To avoid the presence of platelet-related peptides, the authors also recommend using platelet-poor plasma obtained by centrifugation followed by a filtration step. However, the choice of serum can be justified when studying diseases related to coagulation abnormalities. Furthermore, serum is often more readily available in sample banks for retrospective studies.
A controversial parameter is the addition of protease inhibitors (PIs) to the samples. Some authors found that the addition of a PI cocktail induces significant differences in protein profiles when compared to crude samples [58,83]. When introduced directly during phlebotomy, PIs stabilize the fluid for at least 2 hours at room temperature by reducing proteolytic damage. However, PIs also present drawbacks, such as the presence of highly concentrated components in the cocktail, which can later compete for interactions with the protein array.

Table 2: Body fluids: advantages, disadvantages, and recommendations.
Serum
Advantages: (ii) proteins and peptides that "survive" the clotting procedure exhibit a stability that can be exploited in routine clinical applications.
Disadvantages: (ii) biomarkers with poor stability during the coagulation process will not be detected in serum, (iii) possible influence of the disease on the coagulation process.
Recommendations: (ii) keep the sample for 1 hour at RT to allow clotting before centrifugation, (iii) preserve on ice after clotting. Aliquoting and freezing (−80 °C) cannot be done immediately.
Plasma
Advantages: (i) more rapidly processed than serum (interesting for emergency diagnosis).
Disadvantages: (iv) SELDI-TOF spectra less rich in peak number and intensity than serum.
Recommendations: (iv) centrifuge, aliquot, and freeze (−80 °C) as soon as possible. If not possible, keep at RT to avoid cold platelet activation.
Dry blood
Advantages: (i) medical staff not needed for collection, (ii) low blood volume necessary, (iii) easy storage and transport.
Disadvantages: (i) elution step to recover the sample from filter paper.
Recommendations: (i) keep dry specimens at RT for 3-4 hours in horizontal position, (ii) store at −20 °C.
Saliva
Advantages: (i) easy and noninvasive sampling, (ii) medical staff not needed for collection.
Disadvantages: (i) low volume collected, (ii) presence of many proteases and unspecific materials such as food residues or microorganisms, (iii) levels of certain plasma proteins are not reflected in saliva.
Recommendations: (i) always collect with the same method (stimulated or not) and at the same time of day, (ii) centrifuge to remove insoluble material, aliquot, and freeze at −80 °C.
Urine
Recommendations: (iii) centrifuge, aliquot, and freeze at −80 °C, (iv) normalization with creatinine content.
Another important factor in decreasing the risk of variation, bias, and errors is communication between researchers and medical staff. It is generally considered that 70% of errors are due to human intervention (mostly communication problems), while only 30% are instrument related [87]. The mode of specimen collection (venipuncture or arterial puncture), the site of collection, the position of the patient, and the tourniquet technique can all influence the concentration of certain blood constituents [58]. Hemolysis also causes significant changes in the proteome of blood specimens [67]. It is generally advised to discard hemolyzed samples, but this cannot reasonably be done when the disease studied involves spontaneous hemolysis.
Less commonly, filter papers have also been used to collect blood [68]. This mode of collection has the advantage that only a few drops of blood are needed (particularly interesting for neonatal and repeated screening). Moreover, it does not require specific medical support for sampling, which is promising for repeated collections performed by the patient at home, for instance for treatment follow-up. The stability and reproducibility of this collection mode remain to be studied.
Saliva and urine have more recently attracted interest in biomarker discovery. Their collection is simple, noninvasive, and cheap and can easily be repeated. However, as for blood specimens, several factors have to be taken into consideration to improve the reproducibility of sample collection. Saliva protein composition varies with circadian rhythm, diet, age, gender, and physiological status [86]. It is also affected by the method of sample collection (stimulated versus nonstimulated saliva production) [60]. Food ingestion increases proteolytic activity, so collection before lunch rather than after is recommended [73]. The addition of PIs can reduce but not completely eliminate the impact of proteolysis [60]; it stabilizes the saliva proteome, qualitatively and quantitatively, for up to 48 hours [73]. Regarding storage conditions, saliva specimens are preferably stored at −80 °C rather than at −20 °C, where preservation of the protein content cannot be guaranteed for more than 1 month. Interestingly, repeated freeze-thaw cycles (4 to 5) do not seem to significantly alter the saliva protein profile [74].
Urine has the advantage that it can be obtained in large volumes. It is mainly an aqueous solution (95% water) of waste electrolytes and metabolites, organic components (urea, uric acid), and proteins at low concentrations in healthy individuals (150 mg/day). The urine proteome depends mainly on plasma composition, because of the role of urine in regulating blood content, and on the integrity of glomerular filtration, which leads to large intra- and intersubject variability. Protein and salt concentrations can vary during the day for the same subject (first void compared to midstream urine samples) [80,88]. Progressive degradation of the urine proteome due to proteolytic activity can be prevented by PI addition only up to 2 hours of storage [54]. As already mentioned for blood and saliva, up to 5 freeze/thaw cycles do not significantly affect the urine proteome profile. Storage at −80 °C is nevertheless required.
Other fluids such as cerebrospinal fluid, nipple aspirates, tears, synovial fluid, bronchoalveolar lavage, follicular fluid, and amniotic fluid have already been explored by SELDI-TOF-MS [15,20,33,34,69,75,89,90]. These fluids are generally used to study well-localized diseases. Although such fluids contain some plasma proteins, their use for studying systemic diseases is not recommended, and they are difficult to apply in routine diagnosis because of the risk and discomfort related to collection.

Sample Processing.
One of the most challenging aspects in studying body fluids protein profiles remains the detection of the deep proteome [91]. The protein concentration dynamic range detectable by means of MALDI-TOF or SELDI-TOF-MS is about 2 orders of magnitude, whereas the range in blood reaches about 10 orders of magnitude [91,92]. As protein binding onto chromatographic surface depends on its affinity, its concentration, but also on the surface binding capacity, one can imagine that the competition between different proteins for binding sites is very complex. A highly abundant protein with low affinity for the chip surface and a low abundant protein with high affinity may give similar peak intensities in the final SELDI mass spectrum. Furthermore, protein steric hindrance can also affect the SELDI profiles.
A major drawback of sample fractionation is the resulting low sample throughput, due to a significant increase in the duration of analysis, together with a risk of poor reproducibility affecting data treatment. The use of automated technologies can improve reproducibility and decrease the total analysis time. In addition, the same protein can be present in different fractions, which complicates the comparison of its abundance between samples.
Several methods have been proposed for fractionation, such as centrifugal ultrafiltration, precipitation by organic solvents, electrophoresis, chromatography (on column or on magnetic beads), or subcellular localization. The choice is made according to the nature of the sample to be analysed and the protein properties (molecular weight, localization, abundance, etc.). All these sample preparation methods have already been discussed by other reviewers [85,94,101,103,104]. Recently, with the growing interest in studying posttranslational modifications (PTMs), new methodologies have been developed to isolate peptides containing rare amino acids (cys, met, trp, his) or PTM peptides (phosphopeptides, glycopeptides) [25]. One of the most widely used approaches for removing highly abundant proteins from serum and plasma is their depletion using antibodies. Despite the depletion of the nine most abundant proteins from serum or plasma samples, overall published results were quite disappointing [105]. This lack of sensitivity is most probably due to the very low concentration of the peptidome constituents. Moreover, some of the abundant proteins act as carriers, which explains the codepletion of almost 3000 species (peptides and proteins) observed by several groups [106].
A new fractionation approach has recently been developed by Righetti and Boschetti [107]. It relies on a solid-phase combinatorial library of hexapeptides in which millions of copies of a unique ligand are grafted onto each bead. Because each abundant protein rapidly saturates its ligand, the technique dilutes abundant proteins while concentrating components of the deep proteome, which do not reach saturation. This method has the advantage of reducing the dynamic range between the most and least abundant proteins and peptides. It has also been shown that, despite compression of the dynamic range, this technology is applicable to differential studies only for proteins or peptides that do not reach saturation (low- and medium-abundance proteins) [107]. Many studies conducted on different types of samples report good reproducibility and an important gain in the number of low-abundance species by comparison with analyses performed on the corresponding crude samples [96,98,99,108-111], which makes this approach very promising for investigating the deep proteome.

SELDI Settings
In order to highlight candidate protein biomarkers, several chromatographic surfaces must be screened. The choice of the protein chip array chemistry and the nature of the matrix depend on whether the application requires general profiling or a specific protein assay. Different array types and binding conditions may generate complementary protein profiles for the same sample [7]. The use of relevant quality controls (QCs) is highly recommended and even mandatory in such applications [37-40,47-63]. QCs should be well-characterized pools of samples processed alongside the experimental samples in order to monitor instrument performance, optimize mass spectrometry settings (laser energy, etc.), compare target protein profiles to those of historical reference samples, and calculate the coefficient of variation of peak intensities as a measure of reproducibility.
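The coefficient of variation mentioned above can be sketched as follows. This is a minimal illustration, not a procedure taken from the cited studies: the function name `peak_cv_percent`, the peak m/z values, and the intensities are all hypothetical, and the QC replicates are assumed to be already aligned so that each list holds the intensity of one peak across replicate spots.

```python
from statistics import mean, stdev


def peak_cv_percent(intensities):
    """Return the coefficient of variation (in %) of one peak across QC replicates."""
    if len(intensities) < 2:
        raise ValueError("at least two replicate intensities are needed")
    avg = mean(intensities)
    if avg == 0:
        raise ValueError("mean intensity is zero; the CV is undefined")
    # Sample standard deviation divided by the mean, expressed as a percentage.
    return 100.0 * stdev(intensities) / avg


# Hypothetical QC data: peak m/z -> intensity measured on each replicate spot.
qc_peaks = {
    4150.0: [10.0, 12.0, 11.0, 9.0],
    8930.0: [30.0, 45.0, 20.0, 25.0],
}

cv_by_peak = {mz: peak_cv_percent(values) for mz, values in qc_peaks.items()}
for mz, cv in sorted(cv_by_peak.items()):
    print(f"m/z {mz:.0f}: CV = {cv:.1f}%")
```

Tracking such per-peak values from run to run gives a simple way to notice drifts in instrument performance; any acceptance threshold would have to be set by the laboratory's own SOP.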
It is important to point out that the resolution and mass accuracy provided by this kind of instrument are rather low compared to those of high-resolution mass spectrometers (e.g., Q-TOF, FT-MS). Using SELDI-TOF-MS, one cannot expect to accurately determine m/z values or peak intensities in complex mixtures. Indeed, low resolution causes peak overlap, making abundance and mass assignment difficult. This means that only large differences in peak intensities should be considered and that peaks of interest have to be identified with more accurate mass spectrometers. Despite these instrumental weaknesses, and in contrast to other mass spectrometers, SELDI-TOF-MS can be used for high-throughput analysis.
When setting up SELDI experiments, numerous sources of spectral variability have to be taken into account.
Several events occurring during mass spectra acquisition, such as matrix crystallization, ion suppression, and in-source decay, strongly influence peak intensities. These are discussed in more detail below.

Matrix Crystallization.
Differences in reagents, handling of material, room temperature, and humidity level may all influence the (co)crystallization of matrix molecules with the sample, causing interday fluctuation. The structure and nature of the target surface may also affect peak intensities. These parameters must be tightly controlled and standardized for each study protocol. During the crystallization process, proteins can compete for inclusion in the crystals. Easily embedded proteins will be present at higher concentrations in the matrix and will consequently be more efficiently desorbed and ionized [47]. To improve the sample-to-sample reproducibility of MALDI ion yield and to increase the precision of peptide quantification, some authors use nitrocellulose to improve the homogeneity of matrix/analyte crystallization [55,112,113]. This operation might also be helpful for SELDI-TOF-MS measurements.

Ion Suppression.
Depending on sample composition, ion suppression is another factor that significantly contributes to the variability observed in SELDI-TOF-MS spectra [47,50,55]. Indeed, during ionization, analytes compete for protons that are transferred from matrix molecules. If a protonated analyte collides with an unprotonated one that has a higher gas-phase basicity, it may pass its proton to the collision partner. Therefore, the presence of one analyte may reduce the signal intensity of another. This phenomenon is called the "ion suppression effect." In a complex protein mixture like serum, where highly abundant proteins constitute a large proportion of the total protein content, the peaks of these proteins may override the signals of low-abundance peptides. This phenomenon, obviously difficult to prevent in complex samples, would be more easily controlled in mixtures obtained by fractionation.

In-Source Decay.
Another source of variation is the fragmentation of proteins or peptides during the mass spectrometric process. Fragmentation occurring before the first field-free region is called in-source decay (ISD); it is responsible for consecutive series of ions [114].
Ekblad et al. showed that ISD generates a number of additional peaks in the spectra of proteins contained in serum samples when compared with the data collected for pure reference proteins [114]. Conditions favourable to ISD are obviously created when the analytical conditions are optimized by maximizing the total peak count, particularly when a high laser energy is used, which increases the thermal energy of the ions and consequently the number of collisions between ions. Fortunately, in-source fragmentation remains quite limited [114]. Dijkstra et al. developed a method that deconvolutes the spectrum by appropriately associating peaks belonging to the same protein [50]. To benefit from this procedure, highly efficient sample fractionation is recommended.

Miscellaneous.
Other phenomena likely to affect SELDI spectra must be considered. Common mechanisms accounting for the appearance of multiple peaks in mass spectra include, for example, the formation of salt adducts and multiply charged ions [50]. Chemical reactions using energy from the laser may take place between sample protein molecules, matrix molecules, or molecules from the washing buffers, generating intermolecular complexes known as "ion clusters" [50]. The formation of these complexes increases the number of peaks in the spectrum and causes artefacts (e.g., a satellite peak at +206 Da corresponds to an SPA adduct). Moreover, the performance of the SELDI-TOF-MS may change over time due to possible fluctuations in laser intensity and/or detector sensitivity.
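As an illustration of how such adduct artefacts might be screened for in a peak list, the sketch below flags pairs of peaks separated by the +206 Da shift quoted above. It is only a hedged example: the function name `find_adduct_pairs`, the 2 Da tolerance, and the peak masses are hypothetical, and the peaks are assumed to be singly charged so that a mass difference translates directly into an m/z difference.

```python
def find_adduct_pairs(peak_masses, adduct_shift=206.0, tolerance=2.0):
    """Return (parent, satellite) pairs whose mass difference matches the adduct shift.

    adduct_shift: expected mass difference in Da (+206 Da for the SPA adduct in the text).
    tolerance: accepted deviation in Da (an assumed value, to be tuned to the instrument).
    """
    masses = sorted(peak_masses)
    pairs = []
    for i, parent in enumerate(masses):
        for candidate in masses[i + 1:]:
            delta = candidate - parent
            if delta > adduct_shift + tolerance:
                break  # masses are sorted, so later candidates are even further away
            if abs(delta - adduct_shift) <= tolerance:
                pairs.append((parent, candidate))
    return pairs


# Hypothetical peak list (Da); two of the peaks sit about 206 Da above another peak.
peaks = [5904.0, 6110.5, 6631.0, 8930.0, 9135.2]
for parent, satellite in find_adduct_pairs(peaks):
    print(f"{satellite:.1f} Da may be a matrix adduct of {parent:.1f} Da")
```

A flagged pair is only a candidate: two unrelated proteins can differ by a similar mass, so such a filter is best used to mark peaks for inspection rather than to delete them automatically.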
All these difficulties can be addressed only by a substantial reduction in sample complexity and by the application of a rigorous standardization program to the entire analytical process. This involves optimized acquisition protocols (e.g., avoiding excessive laser intensity), a fully operational and calibrated instrument, and the use of suitable QC samples, similar in nature and complexity to the studied samples.

Spectrum Processing.
Another important methodological source of artefacts is the data analysis of protein profiles. Data preprocessing (calibration, baseline correction, normalization, peak detection, and peak alignment) is a key step in SELDI analysis [115-117].
Spectra are generally normalized in order to equalize or minimize differential effects due to external variation [59,115,116,118,119]. The widely used total ion current (TIC) gives a clear indication of the impact of technical variables such as laser and detector performance, matrix application, and sample amount. TIC normalization relies on the assumption that technical parameters are responsible for most of the largest differences observed between samples. However, Cairns et al. showed that TIC normalization may also remove some pertinent biological information [115]. They suggest examining whether normalization factors vary systematically between study groups, and they recommend specifying the methodology applied (local or global normalization, matrix signal excluded or not). The ideal normalization procedure would rely on an internal spiking method.
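A minimal sketch of global TIC normalization, together with the check on normalization factors suggested by Cairns et al., is given below. It is an illustration under stated assumptions rather than the procedure of any cited study: the function name `tic_normalize`, the toy intensities, and the group labels are hypothetical, each spectrum is a list of intensities on a common m/z axis, and the matrix signal region is assumed to have been excluded beforehand.

```python
from statistics import mean


def tic_normalize(spectra):
    """Scale each spectrum so that its total ion current equals the mean TIC of the set.

    Returns the normalized spectra and the per-spectrum normalization factors.
    """
    tics = [sum(spectrum) for spectrum in spectra]
    if any(tic <= 0 for tic in tics):
        raise ValueError("every spectrum needs a positive total ion current")
    target = mean(tics)
    factors = [target / tic for tic in tics]
    normalized = [
        [value * factor for value in spectrum]
        for spectrum, factor in zip(spectra, factors)
    ]
    return normalized, factors


# Hypothetical toy spectra (four intensity bins each) and their study groups.
spectra = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [3.0, 3.0, 3.0, 6.0],
    [5.0, 5.0, 2.0, 3.0],
]
groups = ["control", "control", "patient", "patient"]

normalized, factors = tic_normalize(spectra)

# Compare the mean normalization factor per group: a systematic difference
# would suggest that normalization is removing a group-related signal.
mean_factor = {
    group: mean([f for f, g in zip(factors, groups) if g == group])
    for group in sorted(set(groups))
}
print(mean_factor)
```

With real data the comparison of factors between groups would call for a proper statistical test; the group means are printed here only to show where such a check fits in the workflow.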

Classification Approaches.
One important aspect of SELDI-TOF-MS data analysis is to avoid the false discovery of protein peaks whose discriminative power results from random variation. A general criticism concerns the use of inadequate algorithms for data analysis and the overfitting that arises when high-dimensional data are combined with a low number of cases. These artefacts can be prevented by analysing a sufficient number of samples, by resorting to overfitting-resistant algorithms, by appropriately validating the resulting model, and by using optimal spectra processing techniques (calibration, exclusion of spectral regions affected by high noise, peak alignment, and normalization). Two other remarks can be made from literature reports: (1) multiple biomarkers generally have a better predictive value than individual markers, and (2) the positive predictive values of peptide patterns are often insufficient for them to be recognized as early markers when they concern diseases with a low frequency in the population [38,53].
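Remark (2) follows directly from Bayes' rule, as the short calculation below illustrates. The sensitivity and specificity of 95% and the three prevalence values are hypothetical numbers chosen for the example, not results from the cited studies.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a subject with a positive test actually has the disease."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    if true_positives + false_positives == 0:
        raise ValueError("the test never returns a positive result")
    return true_positives / (true_positives + false_positives)


# Same hypothetical test (95% sensitivity, 95% specificity) at decreasing prevalence.
for prevalence in (0.5, 0.01, 0.0005):
    ppv = positive_predictive_value(0.95, 0.95, prevalence)
    print(f"prevalence {prevalence:.2%}: PPV = {ppv:.1%}")
```

A test that looks excellent in a balanced case-control study can therefore produce mostly false positives when applied to screening for a rare disease, because the healthy subjects vastly outnumber the patients.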
The most commonly used bioinformatics approaches are decision tree-based methods and support vector machines [120,121]. Authors generally emphasize the need for validated model selection, using a cross-validation loop and permutation testing, in order to develop a generalizable classifier able to correctly predict the classification of new samples [122].
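The combination of a cross-validation loop with permutation testing can be sketched as follows. To keep the example self-contained, a simple nearest-centroid classifier stands in for the decision trees and support vector machines named above; the function names, the two-peak toy data, and the number of permutations are all hypothetical. In a real study, any feature selection step would also have to be repeated inside the loop for each permutation.

```python
import random


def centroid(rows):
    """Mean vector of a list of equal-length feature vectors."""
    return [sum(column) / len(rows) for column in zip(*rows)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def loo_accuracy(samples, labels):
    """Leave-one-out cross-validated accuracy of a nearest-centroid classifier."""
    correct = 0
    for i, (held_out, true_label) in enumerate(zip(samples, labels)):
        # Train on every sample except the held-out one.
        train = [(s, l) for j, (s, l) in enumerate(zip(samples, labels)) if j != i]
        centroids = {
            label: centroid([s for s, l in train if l == label])
            for label in sorted({l for _, l in train})
        }
        predicted = min(centroids, key=lambda lab: squared_distance(held_out, centroids[lab]))
        correct += predicted == true_label
    return correct / len(samples)


def permutation_p_value(samples, labels, n_permutations=200, seed=0):
    """Return the observed accuracy and the fraction of label shufflings doing as well."""
    rng = random.Random(seed)
    observed = loo_accuracy(samples, labels)
    shuffled = list(labels)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if loo_accuracy(samples, shuffled) >= observed:
            at_least_as_good += 1
    # The +1 terms count the observed labelling itself as one permutation.
    return observed, (at_least_as_good + 1) / (n_permutations + 1)


# Hypothetical intensities of two peaks; only the first differs between groups.
samples = [
    [1.0, 5.0], [1.2, 4.8], [0.9, 5.1], [1.1, 5.3], [1.0, 4.9],
    [3.0, 5.0], [3.2, 5.2], [2.9, 4.7], [3.1, 5.1], [3.0, 4.9],
]
labels = ["control"] * 5 + ["patient"] * 5

accuracy, p_value = permutation_p_value(samples, labels)
print(f"leave-one-out accuracy = {accuracy:.2f}, permutation p = {p_value:.3f}")
```

If a classifier reaches a similar accuracy on shuffled labels as on the true ones, its apparent performance is likely to reflect overfitting rather than a disease-related signal.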

Biomarkers: From Identification to Clinical Application
Identification of candidate biomarkers, while not strictly necessary for diagnostic purposes, is extremely valuable for data interpretation and for a better understanding of the disease. As often criticized, SELDI-TOF-MS technology does not provide peptide/protein identification. To succeed in identification by sequencing (Q-TOF, TOF-TOF, ion trap, etc.) or peptide fingerprinting (MALDI-TOF), enrichment and purification of the biomarker of interest are often needed, which is laborious and time consuming. To partly address this weakness, a new ProteinChip interface coupled to a tandem mass spectrometer was recently developed, allowing direct sequencing of peptides <6000 Da [124]. In all cases, identifications must be corroborated using antibody-based detection (e.g., Western blot or ELISA) or antibody pulldown with subsequent detection by SELDI-TOF-MS. It should be noted that the concentration range of widely used biomarkers in plasma samples is remarkably wide, extending from the high milligram per milliliter range down to the low picogram per milliliter range. For example, serum albumin, with a normal concentration range of 35-50 mg/mL, is measured as an indication of severe liver disease [125] or malnutrition [126], whereas IL-6, which normally varies within a range of 0-5 pg/mL, is measured as a sensitive indicator of inflammation or infection [127].
Until now, most of the markers identified in SELDI-TOF-MS studies cannot be considered very specific to a given disease; rather, they are representative of the consequences of disease, such as inflammation or immune response. The most frequently identified proteins so far are haptoglobin, transthyretin, apolipoproteins, serum amyloid, or complement factors, present at μg/mL to mg/mL concentrations [13, 19, 23-25, 28, 38]. Although individual acute-phase reaction proteins are not satisfactory diagnostic biomarkers, their combined use with other serum biomarkers may enable more sensitive and specific diagnosis (cf. Figure 3). This phenomenon has recently been termed "host response protein amplification cascade" [122]. Acute-phase proteins could also be directly produced by the diseased tissue.
Figure 3: Protein mass spectra collected on CM10 and IMAC-Cu2+ ProteinChip arrays with serum samples from five patients with arthritides (including rheumatoid arthritis, psoriatic arthritis, and ankylosing spondylitis) and five noninflammatory controls (NIC) (including osteoarthritis). (a) The inflammation-related proteins S100A8, S100A12, S100A9, and one of its variants, S100A9*, are arthritis biomarkers detected on CM10 arrays. (b) On IMAC-Cu2+ ProteinChip arrays, SAA and its 2 variants (SAA-R and SAA-RS) are illustrated. Reproduced from [19].
The moderate specificity of SELDI-discovered biomarkers could be explained by the low sensitivity of the technique. To date, SELDI-TOF-MS has not identified any protein marker present at the ng/mL level. This probably indicates that the detection limit of this technology is around μg/mL, as considered by Diamandis [128]. To overcome this limited detection sensitivity, serum (or plasma) proteins can be fractionated (cf. Section 3.2) before SELDI-TOF-MS analysis. Fractions can then be loaded on different arrays using complementary binding conditions. Moreover, the decisive advantage of mass spectrometry technologies is their capacity to detect protein variants, protein fragments, and posttranslational modifications (PTMs), which is usually not possible with affinity-based technologies. It is now recognized that these components may be disease-specific and can be considered as potential biomarkers (e.g., modified transthyretin forms in ovarian cancer in Figure 4 and in familial amyloidotic polyneuropathy) [123,129,130].
In the last two years, many applications of SELDI-TOF-MS have been published for the diagnosis of cancers [25,42,44,131], especially breast [10,17], prostate [21,132,133], and colorectal cancer [24]. Other recent papers concerning infectious diseases [22], neurodegenerative disorders [35], renal diseases [26,134], and chronic inflammatory diseases [19,135] also demonstrated the great potential of the technique. SELDI-TOF-MS technology has also been used to predict response to therapy, particularly in cancers. Röcken and Whelan described in detail the use of SELDI-TOF-MS not only to predict responses to cancer therapy but also to follow metastatic disease progression and the development of drug resistance [44,136]. Recently, platelet factor 4 (PF4) appeared to be a biomarker of Infliximab nonresponse in Crohn's disease and rheumatoid arthritis [29,137].
For most of these studies, a validation phase should assess the validity of the described potential biomarkers against a larger and more heterogeneous population of patients. The robustness of the candidate markers has to be tested against a level of biological variability that more accurately reflects the variability in the target population.
Unfortunately, several groups failed to validate the biomarkers discovered in their pilot studies, such as McLerran et al. [39,40] and others [36,66]. McLerran et al. described preanalytical bias and concluded that the samples of their first study were most likely biased in their selection. Another validation, performed by Engwegen et al. using distinct patient populations, confirmed that SAA peak clusters are associated with renal cell carcinoma. However, some other markers could not be validated [36]. Such examples demonstrate the importance of strictly controlling parameters such as storage, clotting, time of analysis, instrument performance, sample selection, and statistical classification method.
SOPs are therefore urgently needed in clinical proteomics research, reflecting a growing trend in the field [19,53,62,63]. Interaction between researchers, clinicians, and statisticians is also a key element of success.
Altogether, these applications of SELDI-TOF-MS technology illustrate its capability for the discrimination and follow-up of a multitude of diseases using different body fluids, as well as for the prediction of certain therapeutic responses. It is worth mentioning that the FDA recently approved the first diagnostic tool (named OVA1) derived from SELDI proteomic research. It combines 5 markers for ovarian cancer diagnosis.

Concluding Remarks
Provided that the recommendations described here and elsewhere are taken into account, SELDI-TOF-MS offers very exciting opportunities to discover not only diagnostic but also prognostic and mechanistic markers for a number of major diseases.
In response to the general criticism, standardized procedures and recommendations to minimize bias are now followed by most users. However, as for all other proteomic approaches, some challenges remain, due in part to the complexity and the wide dynamic range of the samples. Sample fractionation and/or enrichment procedures, such as peptide ligand affinity beads, are likely to be the key to visualising the deep proteome. In addition, improvements in the performance of mass spectrometry instruments can be expected (higher resolution, reduced adduct formation, and reduced ion suppression), contributing further to more reliable and faster biomarker discovery.