Cancer Proteomics: The State of the Art

Now that the human genome has been sequenced, the field of proteomics is ramping up to tackle the vast protein networks that both control and are controlled by the information encoded by the genome. The study of proteomics should yield an unparalleled understanding of cancer as well as invaluable new targets for therapeutic intervention and markers for early detection. This rapidly expanding field attempts to track the protein interactions responsible for all cellular processes. By careful analysis of these systems, a detailed understanding of the molecular causes and consequences of cancer should emerge. A brief overview of some of the cutting-edge technologies employed by the field is given, along with specific examples of how these technologies are applied. Soon cellular protein networks will be understood at a level that will permit a totally new paradigm of diagnosis and will allow therapy tailored to individual patients and situations.


Introduction
The human genome is now mapped [1,2]. For the first time in history it is possible to take the full measure of human heredity. Genomics is impacting science with seemingly endless possibilities; however, the challenges presented by cancer continue to be quite daunting. Cancer still lacks a definition based on molecular criteria alone, and a completely robust correlation between cancer and DNA-based changes has not been found [3]. While the genome provides the underlying blueprint of life, an information archive, the proteins do the work of the cell. Most licensed therapeutics and diagnostics work by targeting or analyzing the proteome. The recent publication of the network of protein interactions in the yeast Saccharomyces cerevisiae demonstrates the vast complexity that is buried in protein interaction networks and cannot be recovered from the genome itself [4]. In the absence of an understanding of protein changes, the information contained in the genome yields only a limited view of the full repertoire of tantalizing leads for effective new drug targets and markers for early disease detection present in a cell.
The relatively new field of proteomics seeks to expand this view. Just as the genome denotes the entire DNA code of a cell, the proteome denotes the entire protein complement of a cell, quantitatively as well as qualitatively [5]. The aforementioned yeast example is illustrative of proteome complexity. A single network of 1,548 proteins encompassing 2,358 interactions was discovered in addition to several smaller networks [4]. Through concurrent evaluation of multiple variations in this proteome, a better understanding of cell function and malfunction may be gleaned.
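The idea of one large network alongside several smaller ones can be made concrete with a small sketch. Assuming the interaction data are available as a list of protein pairs, each network is a set of proteins linked directly or indirectly. The function name and the toy protein labels below are hypothetical, and this is not the method used in [4].

```python
from collections import defaultdict, deque

def connected_networks(interactions):
    """Group proteins into networks of directly or indirectly linked members."""
    neighbors = defaultdict(set)
    for a, b in interactions:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen = set()
    networks = []
    for start in neighbors:
        if start in seen:
            continue
        # Breadth-first search collects every protein reachable from `start`.
        component = {start}
        queue = deque([start])
        while queue:
            protein = queue.popleft()
            for partner in neighbors[protein]:
                if partner not in component:
                    component.add(partner)
                    queue.append(partner)
        seen |= component
        networks.append(component)
    # Largest network first.
    return sorted(networks, key=len, reverse=True)

# Hypothetical toy interaction list (placeholder labels, not real data).
toy_interactions = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P5", "P6")]
networks = connected_networks(toy_interactions)
largest = networks[0]
```

On the toy list, the four linked proteins form the large network and the remaining pair forms a smaller one. Applied to a real interaction list, the same grouping would report the size of each network.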
Proteomics, at its very core, tackles a greater number of variables than does genomics. There are 20 coded amino acids rather than 4 nucleotides. Multiple copies of individual proteins exist, and their number varies with cellular environment as well as cell stage. To further complicate the study, no method is known for amplifying specific proteins other than increasing transcription. The luxury of a PCR-like method for amplifying a single protein molecule does not exist.
At first glance it might appear that simply monitoring mRNA levels would yield proteomic information. Recently, two pharmaceutical companies have entered into collaboration based on this premise [6]. Hooper et al. demonstrated an example of such a study, analyzing the effect of commensal gut flora on endothelial mRNA expression in mice using cDNA methods. The authors noted that the mRNA levels of 105 transcripts changed more than two-fold after colonization with B. thetaiotaomicron. Seventy-one of these transcripts were assigned to known genes, while 34 transcripts were from uncharacterized genes [7]. While producing very interesting and valuable findings, this approach does not address the actual changes at the protein level. Protein concentration is not determined by mRNA level alone, and translation does not occur at the same rate on all mRNA molecules present. Post-translational processing and degradation also run at variable rates [8]. In order to understand proteins and protein interactions, the peptide molecules must be studied directly.
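The two-fold criterion mentioned above can be illustrated with a minimal sketch. It assumes two sets of positive expression measurements keyed by transcript. The function name and the numbers are hypothetical and are not taken from [7].

```python
def changed_transcripts(before, after, min_fold=2.0):
    """Return {transcript: fold} for levels that rose or fell by more than min_fold."""
    changed = {}
    for name, base in before.items():
        level = after.get(name)
        if level is None or base <= 0 or level <= 0:
            continue  # fold change is undefined without two positive measurements
        fold = level / base
        if fold > min_fold or fold < 1.0 / min_fold:
            changed[name] = fold
    return changed

# Hypothetical expression values (arbitrary units), for illustration only.
before = {"t1": 100.0, "t2": 100.0, "t3": 100.0, "t4": 100.0}
after = {"t1": 250.0, "t2": 150.0, "t3": 40.0, "t4": 100.0}
hits = changed_transcripts(before, after)
```

Such a filter flags changes in message abundance only. As the text notes, it does not by itself indicate how the corresponding protein levels change.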

Disease markers for diagnosis and tailored treatment
Although still in its infancy, and currently heavily focused on techniques and methods, the field of proteomics already promises many applications to clinical medicine. These fall under two main categories: the diagnosis of disease states, and the discovery of treatment targets. In the realm of disease marker discovery, major advances are occurring already, thanks to this new field. Examples include the recent documentation of potential new markers for invasive breast carcinoma, including Proliferating Cell Nuclear Antigen and some members of the stress protein family [9], a pair of markers for lung adenocarcinoma TA01/TA02 [10], and the observation by our group of a decrease in Annexin 1 in prostate cancer [11]. As the understanding of protein interaction and networks improves, markers more closely tied to disease specifics should surface. When a more complete understanding of which factors are causative in disease is gained, new targets for treatment should emerge. An early example of new target discovery is the identification of 25 signaling targets of the MAP Kinase pathway by proteomic analysis. Only five of these had been previously characterized as MKK/ERK effectors [12]. While a very early result, this report illustrates the power of proteomics in unraveling the interplay of multiple variables.
The two main categories of diagnosis and treatment targets can be brought together to specifically tailor treatment for individual patients. Tailored treatment is already being utilized in the care of Hodgkin's disease and is being considered for coronary artery disease [13,14]. Rosenthal and Schwartz have published criteria for establishing links between genomic variations and disease in the field of patient-tailored therapy. They require that the change in genetics must cause an alteration at the protein level, the beneficial and harmful phenotypes must have apparent clinical differences, the hypothesis linking genotype and phenotype must be convincing, and the number of exemplary cases must be sufficient to draw conclusions [15]. The field of proteomics is well equipped to satisfy such criteria, all of which require an evaluation of protein levels and changes. In the future, changes at the protein level alone, without definable genomic alterations, may be sufficient for individual patient-tailored therapy [16,17].

Separation technology
The first step in proteomic evaluation is choosing an appropriate specimen to study. The most easily obtained samples consist of tissue homogenates. These have the advantage of large size, but the disadvantage of heterogeneity. A bulk tumor may consist of cancerous cells, histologically diverse normal cells and stromal components (vasculature, lymphocytes, etc.) [16]. The effects of disease may be diluted or masked by the non-cancerous components, and changes in the non-cancerous components may be mistaken for disease markers. One method that partially circumvents these problems is the production of cell line cultures, which yields a homogeneous population of cells but may not accurately reflect the proteomic state of the original tumor in the actual patient. A recent study demonstrated only a 20% similarity in the proteomes between cell lines and laser capture microdissected tumor epithelium. In the same study, similarity between tumor and normal tissue obtained from a single patient, and even between different patients, was near 95% [17].
The ideal material for evaluation should be procured in a patient-matched manner, isolating tumor and normal tissue from the same specimen when possible. By comparison of such patient-matched material the true effects of disease can be isolated from interpersonal differences. When tissue separation is carried out on a microscopic scale, diseased cells can be specifically selected and then compared with specifically selected non-diseased cells from the same individual organ. Such a method of tissue separation has been made possible by the invention of the Laser Capture Microscope (LCM) (Fig. 1) [18,19]. After appropriate fixation and staining of a specimen on a standard microscope slide, the slide is placed on the LCM stage and a region of interest is delineated. A cap with a film of low-melting-temperature plastic is placed over the sample, and at the push of a button an adjustable circle of 6-30 micron diameter is melted onto the sample by a laser. When the cap is picked up, the tissue over which the circle was melted adheres to the cap. The adherent cells can then be lysed by standard methods [18]. The advantages of this technique are numerous. Particularly noteworthy are the low-energy activation, which leaves the cell's original proteome unaltered, the accuracy of tissue removal, and the very small tissue quantities required for separation. Although the instrument is convenient to use, procuring large quantities of tissue, on the order of 200,000 cells or more, can be prohibitively time-consuming. Another microscopic tissue dissection system that has been described utilizes UV light to "blow away" tissue that is not of interest. The remaining tissue is then transferred to an appropriate medium by a laser pulse [20].
One method of tissue sampling makes use of a tissue array. Several specimens are placed adjacent to each other on a microscope slide for concurrent evaluation, as illustrated by Kononen et al. Specimens are obtained by core drilling a donor block with a thin-walled, sharpened stainless steel tube of 0.6 mm diameter. Several such core samples are then placed in an array pattern in a wax block, which is sectioned sequentially, thereby producing a polka-dot tissue array on a glass slide. The tissue is then fixed and analyzed by standard immunohistochemical means [21]. The method permits the concurrent evaluation of protein expression in many specimens. However, the heterogeneity of tumor specimens dictates that occasionally samples will be excised that contain no tumor cells at all, and the histologic diversity of the sample dilutes the observable effects of disease. Furthermore, analysis of protein content is limited by antigen retrieval, the inherent subjectivity of immunohistochemistry, and the inability to perform analysis on rare cell types such as microscopic premalignant lesions.
Recently, a new "reverse lysate array" technology was described by Paweletz et al. (Fig. 2) for multiplexed analyte analysis. The authors applied lysates of microdissected esophageal material by pin array to nitrocellulose membranes. The membranes were then probed with specific antibodies for the phosphorylation status of the signal proteins AKT and ERK, illustrating pro-survival pathways at the cancer invasion front. The study demonstrated very high protein sensitivity. Protein quantities found in less than 4 × 10⁻⁴ cell equivalents were detected. Only about 1000 molecules were necessary for detection, and concentration sensitivity was demonstrated through dilution curves [22]. This technology will prove very valuable for detecting low quantities of protein. By using longitudinal patient-matched microdissected material, comparison can be made between normal tissue, low-grade and high-grade premalignant lesions, and diseased tissue from the same patient without masking by histologic diversity. Probing the arrays with antibodies provides specific protein information, and arraying of samples permits very high throughput. The technique's only limitations lie in the time required to procure samples and the requirement for antibodies previously made and purified against known proteins. However, once acquired, microdissected cell lysate libraries from as few as 2000 cells can be used to produce several hundred arrays, each of which can be probed with a specific antibody of interest recognizing a new biomarker for early disease detection, a surrogate endpoint for therapeutic efficacy, or even a new therapeutic target.
Fig. 2. Reverse lysate array. The membrane is then probed with antibodies specific to proteins of interest. Illustration is from [22].
In the absence of specific antibodies, protein separation is one of the most important steps in the entire process. Proteins that can be detected only nonspecifically and cannot be separated are unobservable. No method has been found that will separate the proteome in its entirety in a single step; methods must be used in series to separate specific parts of the proteome for analysis. Techniques such as multiplexed tandem liquid and affinity chromatography followed by nanoESI MS-MS mass spectrometry currently require protein concentrations that leave the minor components of the human proteome undetectable [23]. In the future, however, this technology may provide a non-gel-based solution to proteome mining. Currently, the predominant technique for protein purification and separation in proteomics is two-dimensional (2D) gel electrophoresis (Fig. 3). Proteins are first separated along one axis according to charge in an isoelectric focusing step. The gel is then exposed to an electric field along a perpendicular axis, along which proteins migrate according to the inverse of their molecular weights [24]. Better separation is achieved than by a traditional gel based on either technique alone, but it is clear that there is not a one-to-one correlation between spots and proteins.
Although new approaches such as the development of "zoom gels" to expand the separation range of the technology are increasing resolving capacity, current size limitations and insensitive staining methods place restrictions on required quantities and physical characteristics of the proteins separable by this technique [23]. 2D-gel electrophoresis will probably remain useful in magnification and resolution of specific regions of the proteome, but will always have limitations precluding high-throughput assessment of the entire proteome simultaneously on a single gel format.
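The lack of a one-to-one spot-to-protein correlation can be sketched with a deliberately simplified model of the two separation steps. Each protein is placed by its pI in the first dimension and by the logarithm of its mass in the second, and proteins that fall within the same resolution cell are treated as co-migrating. The resolution values, the function name, and the toy proteins are all assumptions for illustration, not measured gel parameters.

```python
import math
from collections import defaultdict

def assign_spots(proteins, pi_resolution=0.1, mass_resolution=0.02):
    """Group (name, pI, mass_kDa) records into gel 'spots'.

    Proteins in the same pI bin and the same log10(mass) bin are treated
    as indistinguishable on the gel (a simplifying assumption).
    """
    spots = defaultdict(list)
    for name, pi, mass_kda in proteins:
        # First dimension: isoelectric focusing separates by pI.
        pi_bin = math.floor(pi / pi_resolution)
        # Second dimension: migration is modeled as linear in log10(mass).
        mass_bin = math.floor(math.log10(mass_kda) / mass_resolution)
        spots[(pi_bin, mass_bin)].append(name)
    return dict(spots)

# Hypothetical proteins: A and B differ only slightly in pI and mass.
toy_proteins = [
    ("A", 5.25, 49.0),
    ("B", 5.27, 49.5),
    ("C", 7.45, 49.0),
    ("D", 5.25, 25.0),
]
spots = assign_spots(toy_proteins)
```

In this toy case four proteins yield three spots, because A and B share a resolution cell. A narrower pI range spread over the same gel length, as in a "zoom gel", corresponds to a smaller `pi_resolution` in this model.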
In the past, many investigators have analyzed lysates from cell lines [25][26][27][28] and human tissue [29][30][31][32][33][34][35] by 2D-PAGE to look at tumor specific alterations in protein expression for new marker and target discovery. Image databases were developed to map proteins expressed in specific cell types and at defined stages of tumor progression [36][37][38]. All of these annotations are derived from cell type-enriched human tissue.
The recent ability to identify new potential disease markers from actual laser capture microdissected cells from stained human tissue specimens has enabled the analysis of protein expression in not only the affected diseased epithelium, but also the surrounding stroma, normal epithelium, and importantly the premalignant lesions [39][40][41]. 2D-gel profiles from the LCM procured patient-matched normal and cancer epithelium have enabled the discovery of several new potential marker candidates for prostate and esophageal cancers. Intriguingly, these proteins were not detected in stromal cells procured from the same patient tissue sections.
Fig. 3. Illustration of 2-D gel electrophoresis. In step one, the proteins are separated along a narrow strip of gel on the basis of pI in an isoelectric focusing step. In step two, the gel strip from step one is applied to a larger gel and separation is made based on molecular mass.

Protein analysis technology
Once the proteins are separated, methods of analysis must be brought to bear on the separated entities. Protein sequencing by Edman degradation is the most specific method of analysis, but it requires a large quantity and high purity of protein [42]. Proteins removed from 2D gels are being sequenced by this method since the information available has not been exhausted within the current technical limitations, but other methods are needed.
Mass spectral analysis has recently been explored as a detection and protein identification method. When coupled to separation techniques, the resulting technology can be very powerful and will probably be one of the main avenues of future exploration. After appropriate molecule charging, the mass spectrometer detects molecules on the basis of their mass-to-charge ratio (Fig. 4). In the time-of-flight motif, molecules are charged and accelerated through an electric field, and a recording is made of how long they take to travel a specified distance and strike a detector. The longer the time, the more massive the particle relative to its charge. Mass accuracy in the range of a few parts per million is possible through recent innovations. A more sensitive method consists of monitoring the radio frequency (rf) of a circulating population of charged particles in a cyclotron. Fourier transform of the rf signal yields the individual mass-to-charge ratios of the members of the population. This technique has an extremely low detection limit, but instruments are currently very expensive [43,44]. Mass spectral analysis can yield sequence information, though the complete sequence cannot be determined in all cases [43][44][45]. Pattern analysis shows which ions or fragments contain ammonia- or water-losing species. Ammonia can be lost from the N-terminal amino acid, lysine or arginine, and water can be lost from serine or threonine. The fragment containing the N-terminus is identified and a computer reconstruction of the fragments is made. Proteins can then be functionalized with deuterium or with reactive groups of known mass, such as acetyl groups, and the data acquired are used to further specify the protein sequence. Comparing fragment fingerprints with databases of known proteins shortens the entire process considerably. Consequently, as more proteins are discovered and characterized, mass spectral analysis will improve.
Fig. 4. Conceptual illustration of mass spectrometry (particle ionization, magnetic field, detector). An ionized particle will have a trajectory through a magnetic field dependent on its mass-to-charge ratio. By allowing charged particles to pass through such a field a separation is achieved which can be analyzed by a variety of detectors.
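The time-of-flight relationship described above can be sketched numerically. The sketch assumes an idealized linear instrument in which an ion starts at rest, gains kinetic energy zeV in the accelerating field, and then drifts a fixed length L, so that t = L·sqrt(m / (2zeV)). The accelerating voltage and drift length are hypothetical values, and real instruments apply further corrections.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms per dalton

# Hypothetical instrument parameters, for illustration only.
VOLTAGE = 20000.0  # accelerating potential in volts
LENGTH = 1.0       # drift length in metres

def flight_time(mz, voltage=VOLTAGE, length=LENGTH):
    """Flight time in seconds for an ion of mass-to-charge ratio mz (Da per charge)."""
    return length * math.sqrt(mz * DALTON / (2 * ELEMENTARY_CHARGE * voltage))

def mz_from_flight_time(t, voltage=VOLTAGE, length=LENGTH):
    """Invert the flight-time relation to recover m/z (Da per charge)."""
    return 2 * ELEMENTARY_CHARGE * voltage * t ** 2 / (DALTON * length ** 2)

def ppm_error(measured, theoretical):
    """Mass accuracy expressed in parts per million."""
    return (measured - theoretical) / theoretical * 1e6

t_light = flight_time(1000.0)
t_heavy = flight_time(4000.0)  # four times the m/z, so twice the flight time
```

Because the flight time scales with the square root of m/z, quadrupling m/z only doubles the arrival time. The `ppm_error` helper expresses a measured mass in the parts-per-million terms used in the text.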
Another advantage of mass spectrometry is that it can be used as a separating technique, allowing analysis of an inhomogeneous sample. Tandem mass spectrometry uses the mass spectrometer to isolate an ionized protein. The isolated ionized protein is subsequently fragmented through a second charging cycle and the resulting fragment pattern is analyzed for structural information [43][44][45].
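The search for ammonia- or water-losing species in a fragment pattern can be illustrated with a minimal sketch. It assumes singly charged peaks, so that a neutral loss appears as a spacing of about 17.027 Da (NH3) or 18.011 Da (H2O) between two peaks. The peak list, the tolerance, and the function name are hypothetical.

```python
WATER = 18.010565    # monoisotopic mass of H2O in Da
AMMONIA = 17.026549  # monoisotopic mass of NH3 in Da

def neutral_losses(peaks, tolerance=0.01):
    """Find peak pairs separated by the mass of water or ammonia.

    Assumes singly charged ions, so m/z spacing equals mass spacing.
    Returns a list of (parent_mz, fragment_mz, label) tuples.
    """
    losses = {"H2O": WATER, "NH3": AMMONIA}
    found = []
    ordered = sorted(peaks)
    for parent in ordered:
        for fragment in ordered:
            delta = parent - fragment
            for label, mass in losses.items():
                if abs(delta - mass) <= tolerance:
                    found.append((parent, fragment, label))
    return found

# Hypothetical peak list (m/z values), for illustration only.
peaks = [500.300, 482.289, 483.273, 350.200]
found = neutral_losses(peaks)
```

For the toy list, the peak at 500.300 is paired with one peak about 18.011 below it and another about 17.027 below it, flagging a water loss and an ammonia loss respectively.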
Particles must be charged to be observable by mass spectrometry. The charging process has a separating effect, so picking the appropriate method allows detection of different parts of the proteome. Electrospray ionization is accomplished by putting the molecules in a solvent in which ions are generated. The specimen is sprayed into an electric field under vacuum. In the vacuum, the uncharged solvent evaporates away, gently concentrating charge onto ionizable molecules, which are then analyzed [43][44][45]. Matrix-Assisted Laser Desorption Ionization Time-of-Flight (MALDI-TOF) charges particles through excitation of the matrix by a laser [43]. The matrix then transfers energy to the species contained within it. Surface Enhanced Laser Desorption Ionization Time-of-Flight (SELDI-TOF) utilizes a similar phenomenon, except that it has a unique protein baiting technology coupled to it on the front end, enabling the selection and purification of classes of proteins before MALDI-based analysis. Investigators have successfully coupled this technology to LCM to discover new disease marker patterns and perform molecular fingerprinting of stages of human prostate cancer, as well as rapid profiling of colon, esophageal, breast, and ovarian cancer [46]. An example of the results of these studies is shown in Fig. 5. SELDI has recently been used in a variety of applications, including protein profiling in a search for soft tissue regeneration genes [47], monitoring Alzheimer's β-amyloid production [48] and analysis of the proto-oncogene TCL1 as an Akt kinase co-activator [49]. Mendrinos et al. also recently used SELDI in the discovery of urine protein biomarkers in bladder cancer patients [50].
Fig. 5. SELDI protein profiles showing protein patterns that are unique to each human cancer type. A density plot of the mass chromatogram is shown as a protein "bar code". Selected tissue is lysed and the lysate applied to an H4 reverse-phase chip. The chip is then analyzed by SELDI methodology. Illustration is from [46].
MALDI and SELDI are powerful members of the growing list of proteomic technologies enabling the discovery of disease markers and therapeutic targets.

Looking back, looking forward
As we look back over the last decade, many changes are apparent in the understanding of microbiology and biochemistry. The level of detail to which the various aspects of normal and aberrant cellular function, cell signaling, respiration, division, and death are understood is many times greater than it was even a few years ago. The explosion in biotechnology and the products produced for detection and treatment of disease is now only beginning in earnest. The completion of the genome project will only serve to expand the coverage and is now ushering in the next step to understanding the cellular basis of disease. Proteomics, because of its unique position for the elucidation of the components that make up the actual molecular targets for therapy and disease markers, stands poised to take up and carry on the progress made to date.