Characterization of Plasma Membrane Proteins from Ovarian Cancer Cells Using Mass Spectrometry

To determine how the repertoire of plasma membrane proteins changes with disease state, specifically in cancer, several methods for preparation of plasma membrane proteins were evaluated. Cultured cells derived from stage IV ovarian tumors were grown to 90% confluence and harvested in buffer containing CHAPS detergent. This preparation was centrifuged at low speed to remove insoluble cellular debris, resulting in a crude homogenate. Glycosylated proteins in the crude homogenate were selectively enriched using lectin affinity chromatography. The crude homogenate and the lectin-purified sample were prepared for mass spectrometric evaluation. The general procedure for protein identification began with trypsin digestion of protein fractions, followed by separation by reversed-phase liquid chromatography coupled directly to a conventional tandem mass spectrometer (an LCQ ion trap). Mass and fragmentation data for the peptides were searched against a human proteome database using the informatics program SEQUEST. Using this procedure, 398 proteins were identified with high confidence, including receptors, membrane-associated ligands, proteases, phosphatases, as well as structural and adhesion proteins. Results indicate that lectin chromatography provides a select subset of proteins and that the number, quality, and confidence of the protein identifications improve for this subset. These results represent the first step in development of methods to separate and successfully identify plasma membrane proteins from advanced ovarian cancer cells. Further characterization of plasma membrane proteins will contribute to our understanding of the mechanisms underlying progression of this deadly disease and may lead to new targeted interventions as well as new biomarkers for diagnosis.


Introduction
The ability of a given epithelial cell type to adhere to the basement membrane and to associate in the appropriate geometry with neighboring cells of the same type is dependent upon the expression of appropriate plasma membrane proteins governing cell-cell interactions and adhesion. The ability of any cell to interrogate the extracellular environment for cues that modulate the balance between proliferation, differentiation, and apoptosis is also dependent on the array of membrane receptors expressed by that cell type, as well as the regulated availability of autocrine growth factors and proteases that are frequently tethered to the plasma membrane [3]. Thus uncontrolled proliferation, invasion of basement membranes, and the ability to survive as anchorage-independent cells are likely to involve changes in the expression of key membrane-associated proteins [24]. Identification of protein expression patterns associated with stages of malignant transformation has the potential to further our understanding of the basic processes underlying tumor progression, and eventually, to identify potential protein markers of diagnostic or prognostic significance [10]. This need is particularly compelling in ovarian carcinoma, as presently there are no good diagnostic markers for early stage detection [6]. As a result, over 75% of the women presenting with ovarian cancer have stage III or IV disease, for which the morbidity is still unacceptably high; 95% of these women will die of their cancer within five years of diagnosis [14,15].
Identifying the changes in membrane protein expression associated with tumorigenesis and progression of ovarian cancer can be approached by analyzing differences in mRNA expression between ovarian tumors and normal ovarian surface epithelial cells, and several groups have used cDNA microarrays to address this question [13,17,26]. However, since protein function and abundance can be markedly affected by posttranslational mechanisms such as covalent modification (e.g. phosphorylation), trafficking, and proteolysis, approaches that target protein content directly, rather than indirectly via mRNA levels, will provide an increased level of mechanistic information.
Traditional methods for assessing protein expression, whether by two dimensional polyacrylamide gel electrophoresis (2D PAGE) or 1D SDS-PAGE and immunoblotting, are time-consuming and labor intensive. Immunoblotting-based methods require assumptions about which proteins are likely to be differentially expressed, and also depend on the availability of suitable antibodies. 2D PAGE methods are relatively slow and cumbersome, and sensitivity is limited by the large amounts of protein needed to visualize proteins on a gel. In a recent study, Aebersold and coworkers showed that the number of 2D PAGE "spots" is poorly correlated with the number of different proteins actually detected, and that those proteins identified were predominantly expected to be present at high abundance based on their codon bias [7]. In addition, comparison of protein abundances between samples using 2D PAGE generally is based on protein spot intensity; however, this method cannot accurately measure small differences in protein abundance, particularly when spots are poorly resolved. These constraints significantly limit the application of 2D gel-based approaches to clinical analysis of protein expression in human tumor samples, particularly at early stages.
Membrane proteins with important regulatory functions such as receptors and ligands tend to be expressed at low levels, creating a need for a high sensitivity detection system [1,16,18]. Defining the range of plasma membrane proteins expressed by tumor cells presents several technical challenges, including purification of plasma membrane proteins away from other cellular compartments, solubilization of integral membrane proteins, post-translational modifications of membrane proteins including glycosylation and phosphorylation, and compatibility of the plasma membrane preparation with mass spectrometry-based analysis methods [18,21,23]. The primary objective of our research was the development of more effective strategies for identifying large integral membrane proteins from biological specimens based on affinity chromatography. Using this procedure, we have succeeded in identifying a variety of integral membrane proteins, including receptor tyrosine kinases, seven-transmembrane domain receptors, membrane-associated ligands, proteases, and phosphatases, as well as structural and adhesion molecules.

Cell culture and subcellular fractionation
SKOV3 ovarian tumor cells were grown to confluence in 10 cm plates (approximately 10^7 cells), rinsed in serum-free phosphate buffered saline and scraped into lysis buffer (10 mM HEPES, pH 7.5, 250 mM sucrose, 1 mM EDTA, 1 mM EGTA, 2 mM sodium vanadate, 1 mM PMSF, 1% aprotinin, 1% CHAPS) at a ratio of five plates per ml buffer. Cells were lysed by 20 strokes of a Dounce homogenizer (B pestle) and cellular debris was pelleted by centrifugation at 1,000 × g. Typical total protein yields (Bradford assay) were 1 mg protein per 10 cm plate (approximately 5 mg/mL) for the crude plasma membrane lysate.

Two-dimensional gel analysis
Cell lysates were subjected to 2D-PAGE analysis essentially as described by the manufacturer of the 2D analysis system (Amersham Pharmacia). Cell lysates were dialyzed against fresh 8 M urea/20 mM Tris-HCl (pH 7.2), brought to 1% SDS and heat denatured. Samples containing 100 µg of protein in the manufacturer's rehydration buffer were subjected to isoelectric focusing on pH 4-7 strips, followed by SDS-PAGE in 8% acrylamide gels. Proteins were visualized by silver staining (Pharmacia PlusOne).

Lectin affinity chromatography
Since most plasma membrane proteins are highly glycosylated on their extracellular domains, cell extracts can be significantly enriched for membrane proteins by affinity chromatography on a variety of lectins [12]. The starting material for this procedure was the crude plasma membrane preparation described above. One ml of the crude plasma membrane preparation, containing 5 mg protein, was loaded onto a wheat germ lectin affinity column (Vector Labs) according to the manufacturer's recommended procedure and washed. Analysis of the resulting fractions using the Bradford assay indicated that about 95% of the protein was present in the flow-through and washes. Glycosylated proteins were eluted with 0.5 M N-acetylglucosamine and represented approximately 3% of the protein applied to the column. The eluted proteins were used for mass spectrometric characterization and appear to be highly enriched for membrane proteins, as evidenced by depletion of non-plasma membrane proteins in the MS/MS analysis.

Deglycosylation
To determine whether deglycosylation improves the quality of the mass spectrometric results, we evaluated both enzymatic (N-Glycosidase F used according to the manufacturer's protocol; Roche-Boehringer-Mannheim) and chemical deglycosylation relative to non-deglycosylated controls and compared the quality of the mass spectral data. For the chemical deglycosylation, 50-100 µg of protein was dialyzed against 20% formic acid/50% isopropanol/30% water and then lyophilized for 48 hrs to remove as much water as possible from the samples. The sample was placed in a dry ice/ethanol bath and 3% trifluoromethanesulfonic acid (TFMS) dissolved in toluene was slowly added. After a 15 min reaction time, 150 µl pyridine and 400 µl 0.5% ammonium bicarbonate were added to neutralize the TFMS. The sample was again dialyzed, this time against 2 M urea/25 mM bicarbonate buffer, pH 7.5, to remove unwanted reaction byproducts. The sample was digested with trypsin (20:1) and then dialyzed against 10% acetonitrile/0.1% acetic acid/0.01% TFA. The resulting peptides were taken to dryness in a speed-vac and redissolved in 25 µl 0.1% acetic acid/0.01% TFA.

Mass spectrometry analysis
The crude and lectin-purified protein fractions were digested with trypsin (50:1) and used for identification of Potential Mass Tags (PMTs) by LC-MS/MS as follows: Five µl (10 µg) of digested sample were injected onto the LC-MS/MS system (Agilent capillary LC 1100 and Finnigan LCQ Classic) and analyzed by data-dependent MS/MS. The LC solvents were: A) 0.1% acetic acid and 0.01% trifluoroacetic acid (TFA); and B) 90% acetonitrile with 0.1% acetic acid and 0.01% TFA. The 40 cm capillary columns (360 µm OD/150 µm ID) were packed in-house with Jupiter 5 µm C-18 media (Phenomenex, Torrance, CA). A 120-minute gradient from 0% B to 95% B was used to elute the peptides at a column flow rate of 2.5 µl/min. The three most abundant peaks observed in the first stage of MS in the mass/charge (m/z) range of 400-2000 were selected for collision-induced dissociation (CID). Peptide dissociation was performed using a 35% relative collision energy. A dynamic exclusion window of 3 min was used after a peptide ion was selected for CID.

Gas-phase fractionation
Improvements in sensitivity and dynamic range for peptide identifications can be achieved by decreasing the size of the m/z window of the mass spectrometer. This technique is referred to as "m/z segmentation" or "gas phase fractionation". To determine whether this approach significantly improves the quality of our plasma membrane analysis, the same sample was injected five times and data were collected for overlapping m/z windows (475-825; 775-1125; 1075-1425; 1375-1725; 1675-2025). Results from this experiment relative to an identical non-segmented run demonstrate that about 3 times as many unique peptides could be identified (data not shown), which translated into a 2.5-fold increase in the number of proteins identified [2]. This improvement increases our confidence in many peptide/protein identifications and increases the number of protein identifications in general.
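The window scheme described above can be illustrated with a short Python sketch. The window boundaries are those given in the text; the function names are hypothetical and chosen only for illustration. The sketch reports the overlap between consecutive windows and which windows would cover a given precursor m/z.

```python
# m/z windows listed in the text; each injection scans one window.
WINDOWS = [(475, 825), (775, 1125), (1075, 1425), (1375, 1725), (1675, 2025)]


def overlaps(windows):
    """Width (in m/z units) of the overlap between consecutive windows."""
    return [
        prev_hi - next_lo
        for (_, prev_hi), (next_lo, _) in zip(windows, windows[1:])
    ]


def windows_covering(mz, windows=WINDOWS):
    """Return every window whose range includes the given precursor m/z."""
    return [(lo, hi) for lo, hi in windows if lo <= mz <= hi]


if __name__ == "__main__":
    print(overlaps(WINDOWS))        # overlap between neighbouring windows
    print(windows_covering(800.0))  # an ion in an overlap region
```

An ion that falls in an overlap region is eligible for selection in two separate injections, so ions near a window boundary are not lost.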

Identification of membrane proteins by LC-MS/MS
Identified proteins are a combination of results from these experiments. Tandem mass spectra were analyzed by SEQUEST (ThermoFinnigan) [2,11,16,27]. SEQUEST compares experimental spectra with predicted, idealized mass spectra generated from a database of protein sequences. These idealized spectra are weighted largely with b and y fragments, i.e. fragmentation at the peptide bond from the N- and C-termini, respectively. The SEQUEST analysis was performed using a modified version of the human.fasta protein database supplied with the software. Database modifications included the removal of viral proteins and other redundant protein entries as described by Adkins et al. [2]. The SEQUEST parameters were set to search for trypsin fragments (allowing for up to two missed cleavage sites) using the monoisotopic mass for both the parent and fragment ions with a 0.5 m/z window. A minimum ion count of 35 was used. SEQUEST results were filtered according to criteria established by Yates et al. [11,22,27,28]. In short, a minimum value of 0.1 was used for DelCN, an indicator of the separation between the peptide with the highest correlation and that with the second best correlation [28]. Singly charged peptides must be tryptic with an Xcorr greater than 1.9. Doubly and triply charged peptides must be tryptic with an Xcorr greater than 2.2 and 3.75, respectively; additional filtering criteria are described in Table 1 (Adkins et al., manuscript in preparation). Any protein with 4 or more peptides passing the above filters was considered to be a very confident identification. When an individual protein had 3 or fewer peptides passing the above criteria, the mass spectra for those peptides were manually inspected. The final decision on whether the peptide fragmentation was of adequate quality to be considered a hit was based on four types of information [25].
First, the spectrum quality must be good, with the ion peaks used in the determination clearly above the noise baseline. Second, there should be continuity within the b or y fragmentation pattern. Third, if a proline is present, then the adjacent y fragment should give an intense signal. Last, unidentified intense peaks should correspond either to doubly charged ions or to the parent mass less one or two amino acids.
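The score-based part of the filtering described above can be sketched in Python as follows. This is a minimal illustration, not the software actually used: the record layout (dictionaries with `protein`, `charge`, `xcorr`, `delcn`, and `tryptic` keys) and the function names are assumptions, and only the thresholds stated in the text are encoded. The additional criteria of Table 1 and the manual inspection step are not modeled.

```python
from collections import Counter

# Thresholds stated in the text: minimum Xcorr per charge state and minimum DelCN.
XCORR_MIN = {1: 1.9, 2: 2.2, 3: 3.75}
DELCN_MIN = 0.1


def passes_filter(hit):
    """True if a peptide hit meets the stated Xcorr, DelCN, and tryptic criteria."""
    threshold = XCORR_MIN.get(hit["charge"])
    if threshold is None:  # charge states not covered by the stated criteria
        return False
    return hit["tryptic"] and hit["xcorr"] > threshold and hit["delcn"] >= DELCN_MIN


def classify_proteins(hits):
    """Label each protein by its number of passing peptides.

    Four or more passing peptides: 'confident'.
    One to three passing peptides: 'manual_inspection'.
    """
    counts = Counter(h["protein"] for h in hits if passes_filter(h))
    return {
        protein: "confident" if n >= 4 else "manual_inspection"
        for protein, n in counts.items()
    }
```

Proteins with no passing peptides do not appear in the returned dictionary.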

Characterization of human plasma membrane proteins in silico
The task of identifying proteins with precision in biological samples from humans is significantly complicated by the complexity of the human proteome and the heterogeneity of biological samples. In order to gain an understanding of the characteristics of human plasma membrane proteins, we used the publicly available annotated version of the human genome (ftp://ftp.ncbi.nih.gov/genomes/hsapiens/proteins/) after removal of redundancies and viral proteins. This resulted in a database of 44,449 proteins. A scan of this database with an algorithm that recognizes regions of hydrophobicity theoretically capable of forming transmembrane domains [TMHMM] identified 5,015 human proteins with one or more potential transmembrane domains, indicating that approximately 11% of the human proteome may be membrane associated. The proteins ranged in size from 6 kDa up to approximately 600 kDa, with isoelectric points ranging from 3 to 10 (Fig. 1). To aid in our understanding of how to manipulate membrane proteins, we conducted a virtual tryptic digestion of the 44,449 non-redundant human proteins, generating peptides under conditions of complete trypsin digestion (i.e., no missed cleavage sites). Of these theoretical peptides, 75% are between 5 and 50 amino acids in length (molecular weights between ∼500 Da and ∼5,000 Da), the most useful size range for mass spectrometry measurements and protein identification. Of all the human proteins, 99% contain unique tryptic peptides (by sequence), with an average of 16.6 unique peptides per protein. Nearly identical results were observed for the transmembrane proteins.
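A virtual digestion of this kind can be sketched in a few lines of Python. The sketch below assumes complete cleavage C-terminal to lysine (K) and arginine (R); the optional rule that trypsin does not cleave before proline is a common convention and an assumption here, since the text does not state which rule was applied. The function names and the example sequence are hypothetical.

```python
def tryptic_digest(sequence, proline_rule=True):
    """Complete in silico tryptic digestion (no missed cleavages).

    Cleaves after K or R. With proline_rule=True, a K/R followed by P is
    not cleaved (an assumed convention, not stated in the text).
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            followed_by_pro = i + 1 < len(sequence) and sequence[i + 1] == "P"
            if proline_rule and followed_by_pro:
                continue
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):  # C-terminal peptide without a trailing K/R
        peptides.append(sequence[start:])
    return peptides


def count_in_size_range(peptides, min_len=5, max_len=50):
    """Number of peptides within the 5 to 50 residue range cited in the text."""
    return sum(min_len <= len(p) <= max_len for p in peptides)


if __name__ == "__main__":
    example = "AAKRPGGRCCK"  # made-up sequence for illustration only
    fragments = tryptic_digest(example)
    print(fragments, count_in_size_range(fragments))
```

Applying `tryptic_digest` to every entry of a protein database and pooling the resulting peptides would give the length distribution and the per-protein count of unique peptides discussed above.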

2-D gel of plasma membrane proteins from cells derived from stage I and stage IV ovarian tumors
A comparison of plasma membrane preparations obtained from CAOV3 (left) and SKOV3 (right) cells analyzed by 2D electrophoresis and silver staining is shown in Fig. 2. These two tumor cell lines represent distinct stages of ovarian cancer (stage I CAOV3 and stage IV SKOV3) [23]. Differences in plasma membrane protein expression profiles between these two cell lines evident in the 2D gels most likely reflect differences associated with progression of ovarian cancer from a primary tumor to metastatic disease and support the need for global characterization of the plasma membrane proteome.

Protein identifications
SEQUEST analysis of mass spectrometric data of the crude cell lysate and lectin affinity purification resulted in many identified peptides. When these data were filtered using the rigorous criteria outlined in Table 1, 1438 unique peptides were observed, correlating to 398 unique proteins with high confidence. About one-half of these proteins not only met the rigorous SEQUEST criteria but also were identified by multiple peptide hits, providing another level of confidence in the identifications. These proteins were classified as plasma membrane, plasma membrane associated, non-membrane, enzymes, and hypothetical proteins (Fig. 3). The proteins listed are a combination of results from several preparation and mass spectrometry runs (Table 2 contains a select list of identified proteins). The proteins identified range from structural proteins known to be highly expressed in SKOV3 cells (actins, vimentin and tubulin) to low abundance osmotic regulators such as the Na+/K+ ATPase, a protein tyrosine phosphatase, and several G-protein coupled receptors including the metabotropic glutamate receptor and acetylcholine receptor. A representative total ion chromatogram along with scans showing mass and fragmentation data for a peptide that maps to EGF receptor is shown in Fig. 4. A total of four EGFR peptides were observed, all with high SEQUEST XCorr scores.
Table 2. Select list of identified proteins (accession number, protein)
            Probable N-methyl-D-aspartate receptor - fruit fly (Drosophila melanogaster)
2135969     Probable olfactory receptor tpcr27 - human (fragment)
7657447     Protocadherin 68
119533      Receptor Protein-Tyrosine Kinase (ERBB-2)
6707663     Retinal-specific atp-binding cassette transporter
4506761     S100 calcium-binding protein A10
5032057     S100 calcium-binding protein A11
4504041     Serotonin Receptor
4502099     Solute carrier family 25
121567      Thyroid Hormone Receptor
5803113     Transmembrane protein (63kD)
5803201     Transmembrane trafficking protein
3041682     Tumor Necrosis Factor Receptor II
Membrane-associated proteins
gi|4885199  HLA class I histo antigen, A-11 alpha chain
gi|178024   Actin - alpha human (fragment)
gi|5453595  Adenyl cyclase-associated protein
gi|113950   Annexin A2, Human
gi|1345615  Bone morphogenetic protein 1 homolog
gi|404105   Catenin - alpha
gi|461854   Catenin (cadherin-associated protein), beta 1 (88kD)
gi|119533   CD36 antigen (thrombospondin receptor)-like 2
gi|5031631  CD36 antigen (thrombospondin receptor)-like 2
gi|116848   Cofilin 1 (non-muscle)
gi|1155306  Collagen, type III, alpha 1
gi|122157   Cyclic Nucleotide-gated Channel CNCG3L
gi|1017427  Elastic titin - (fragment)
            Filamin 1 (actin-binding protein-280)
gi|4504041  G protein, alpha inhibiting activity polypeptide 2
gi|2137361  GPI-anchored protein
gi|121567   GRP 78 (immunoglobulin heavy chain binding pr)
gi|3041682  GTP-binding protein G(Y), (alpha-11)
gi|231436   HLA class I histo antigen, CW-8 CW*0803 alpha
gi|124      Integrin, b1 (fibronectin receptor, antigen CD29)
gi|4506787  IQ motif containing GTPase activating protein 1
gi|435476   Keratin 9, cytoskeletal
gi|2506805  Laminin alpha 2 subunit precursor
gi|112803   Lymphocyte activation antigen 4F2 large subunit
gi|114374   Na+/K+ -transporting ATPase alpha-1 chain
gi|2501082  Synaptobrevin 3 (cellubrevin)
gi|37850    Vimentin
gi|4507879  Voltage-dependent anion channel 1
These results demonstrate our ability to detect and identify plasma membrane proteins expressed at low abundance and representing several functionally important classes. When the lectin-purified samples were analyzed by LC-MS/MS, more proteins were visible than had been seen when the crude lysate was evaluated. While identification of the non-glycosylated proteins decreased, identification of protein classes that are frequently glycosylated, namely receptors and other transmembrane proteins, became more pronounced. The improvement in identification became most apparent for moderate and low-abundance proteins, where one or only a few peptide hits were observed.
We also evaluated the mass spectrometry data using less rigorous criteria (i.e., XCorr > 1.5 and DelCN > 0.05) for identification purposes. These criteria were selected for a second evaluation of the data since the membrane proteins of interest are known to be present in modest amounts, which makes it more difficult to obtain high quality fragmentation information. When the data were evaluated in this manner, 3,424 peptides were identified that mapped to 1,894 proteins. This database of proteins contained many more membrane proteins, including receptors, than the more rigorous identifications and therefore will be evaluated by Fourier transform ion cyclotron resonance (FTICR) mass spectrometry. These results suggest that a second analysis of the samples using FTICR mass spectrometry to obtain highly accurate mass measurement of the peptides will significantly increase the number of high quality membrane protein identifications.

Deglycosylation
Most membrane proteins are highly glycosylated on their extracellular surface. Glycosylation has the potential to inhibit trypsin digestion and the modified peptides will be difficult to separate reproducibly. In addition, the variability in the masses due to the heterogeneity of the carbohydrate length makes mass spectrometric characterization of these peptides impractical. Because of these complications, methods to remove the carbohydrate moiety from the proteins were implemented prior to analysis of the samples by mass spectrometry. Results from this study demonstrated that chemical deglycosylation was superior to the enzymatic method. With the chemical method, the protein was completely digested after incubation with trypsin while the control and enzymatically digested samples contained large amounts of incompletely digested protein as evidenced by a broad intense peak with retention time of 60-80 min (Fig. 5). In addition the ion intensities for many peptides were significantly increased relative to the non-deglycosylated control. These data suggest that this approach will significantly improve our analysis and has the further advantage of increasing the sensitivity and dynamic range of our LC-MS/MS analysis for protein identification.

Discussion
Identification of plasma membrane proteins from SKOV3 cells is an important first step in identifying new therapeutic targets and biomarkers for ovarian cancer [1]. The goal of these experiments was to identify as many plasma membrane proteins from SKOV3 tumor cells as possible using tandem mass spectrometry. The most common approach is 2D PAGE for separation coupled with mass spectrometry for protein identification; however, 2D PAGE analysis of membrane proteins is associated with significant technical difficulties related to the size, hydrophobicity, and glycosylation-driven heterogeneity of most membrane proteins. Here we demonstrate an alternate approach in which the intact proteins are partially purified by affinity chromatography and reverse-phase chromatography is then used to separate tryptic peptides immediately prior to tandem mass spectrometry. This approach has the advantage of increased dynamic range, since the dynamic range is determined by the capabilities of the mass spectrometer rather than the ability to visualize spots on the gel. This approach resulted in high quality identification of nearly 400 proteins. The proteins that were identified, in general, were those whose peptides are readily separated from other peptides and have sufficient signal-to-noise ratio that their fragmentation patterns are readily deconvoluted by the SEQUEST software. Interestingly, protein identifications vary somewhat between samples and mass spectrometry runs; this variability is most commonly observed with low abundance proteins.
The marked increase in the number of proteins tentatively identified under the relaxed criteria illustrates the importance of validating the initial conventional ion trap data with an instrument capable of higher resolution and measured mass accuracy, such as the FTICR-MS instruments available at PNNL [4,8,9]. This approach involves using the PMT (potential mass tag) database prepared from tandem mass spectrometry data that have been sorted using relatively loose criteria. Samples are then rerun under identical chromatographic conditions and peptide masses are measured at very high mass accuracy (1-10 ppm measured mass accuracy) using the FTICR-MS. Peptide identities within the PMT database are confirmed (or excluded) using retention times to aid in the analysis. Once a peptide in the PMT database is confirmed by FTICR-MS, it is promoted to the AMT (accurate mass tag) database and can be used for future studies without the need for repeating tandem mass spectrometric analysis [19,20]. Using these approaches, we expect to confirm many of the plasma membrane protein identifications and move forward with efforts to understand how cell surface proteins change with progression of ovarian disease.
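The promotion of a PMT to an AMT can be illustrated with a minimal Python sketch. It is not the PNNL software: the record layout, the function names, the use of a normalized retention time between 0 and 1, and the retention time tolerance are all assumptions made for illustration. Only the 10 ppm mass tolerance is taken from the 1-10 ppm range quoted above.

```python
def ppm_error(measured_mass, theoretical_mass):
    """Mass measurement error in parts per million."""
    return (measured_mass - theoretical_mass) / theoretical_mass * 1e6


def promote_to_amt(pmt_entries, observed_features, ppm_tol=10.0, rt_tol=0.02):
    """Return the peptides from the PMT list confirmed by an accurate-mass feature.

    pmt_entries: dicts with 'peptide', 'mass' (theoretical), and 'rt' keys.
    observed_features: (measured_mass, rt) pairs from the high-accuracy run.
    rt values are assumed to be normalized retention times; rt_tol is an
    illustrative tolerance, not a value from the text.
    """
    confirmed = []
    for entry in pmt_entries:
        for mass, rt in observed_features:
            mass_ok = abs(ppm_error(mass, entry["mass"])) <= ppm_tol
            rt_ok = abs(rt - entry["rt"]) <= rt_tol
            if mass_ok and rt_ok:
                confirmed.append(entry["peptide"])
                break  # one matching feature is enough to confirm
    return confirmed
```

A peptide is promoted only when both the mass and the retention time agree, so a candidate with the correct mass but the wrong elution position is excluded.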