Urine Glycoprotein Profile Reveals Novel Markers for Chronic Kidney Disease

Chronic kidney disease (CKD) is a significant public health problem, and progression to end-stage renal disease leads to dramatic increases in morbidity and mortality. The mechanisms underlying progression of disease are poorly defined, and current noninvasive markers incompletely correlate with disease progression. Therefore, there is a great need for discovering novel markers for CKD. We utilized a glycoproteomic profiling approach to test the hypothesis that the urinary glycoproteome profile from subjects with CKD would be distinct from healthy controls. N-linked glycoproteins were isolated and enriched from the urine of healthy controls and subjects with CKD. This strategy identified several differentially expressed proteins in CKD, including a diverse array of proteins with endopeptidase inhibitor activity, protein binding functions, and acute-phase/immune-stress response activity supporting the proposal that inflammation may play a central role in CKD. Additionally, several of these proteins have been previously linked to kidney disease implicating a mechanistic role in disease pathogenesis. Collectively, our observations suggest that the human urinary glycoproteome may serve as a discovery source for novel mechanism-based biomarkers of CKD.


Introduction
Chronic kidney disease (CKD) affects approximately 11% of the US population with over 100,000 individuals progressing to end-stage renal disease (ESRD) annually [1,2]. Despite this significant and growing public health problem, it remains difficult to predict which individuals will progress to ESRD. As ESRD carries a substantial increase in morbidity and mortality, it is critical to identify this high-risk patient population that would most benefit from early and aggressive therapy.
Current strategies for predicting CKD progression are limited. Pathologic examination of renal tissue provides valuable information on the degree of interstitial fibrosis and predilection for ESRD. However, renal biopsy is invasive and has a limited role in longitudinal follow-up. Quantitative measures of proteinuria have long been used as noninvasive markers of CKD progression [3], yet these largely albumin-based methods detect nonselective proteinuria and incompletely correlate with disease. With recent advances in high-throughput technology and mass spectrometry techniques, urine proteomic investigation is an attractive tool in the pursuit of noninvasive and specific markers of CKD progression [4,5].
Numerous investigators have successfully applied broad-scale urine proteomic strategies to kidney disease. The urine proteome predicts nephropathy and decline in renal function in diabetic subjects [6,7]. It also correlates with early changes of focal segmental nephrosclerosis [8], can identify IgA nephropathy and renal allograft rejection [9,10], and predicts treatment response and disease activity in nephrotic syndrome and lupus nephritis [11,12]. Despite these advances, analysis of the entire urine proteome is particularly difficult in CKD. With disruption of the glomerular filtration barrier and leakage of abundant plasma proteins into the urine, a nonselective, largely albumin-predominant pattern often results [13]. To overcome this, methods to increase the detection of low-abundance proteins have been developed to provide disease specificity and clinical relevance of urine profiling and to mechanistically understand factors influencing disease progression. Glycoprotein enrichment techniques allow depletion of albumin and other abundant plasma proteins while providing a more thorough analysis of a subfraction of the urine proteome. As glycosylated proteins are critical for cellular interactions and signaling cascades, disease states are likely to cause early and specific alterations in urinary glycoprotein excretion. Indeed, glycoproteins are now important markers of autoimmunity and malignancy [14,15]. More recently, the plasma glycoproteome has been used to predict nephropathy in diabetic subjects [16]. Despite this promising role as a noninvasive and specific biomarker of disease, little is known about the urinary glycoproteome in CKD.
We hypothesized that the urinary glycoproteome would be altered in CKD compared to healthy controls and that specific glycoprotein alterations might be useful in predicting CKD progression. The overall goal of this study was to perform an initial exploratory analysis of the urine glycoproteins in patients with CKD compared to healthy controls. We present a comprehensive profiling of the urinary glycoproteome in control and CKD subjects utilizing a hydrazide enrichment technique combined with tandem mass spectrometry identification of the glycoproteins.

Sample Collection and Processing.
Clean catch urine samples were obtained from six CKD subjects and six age-matched healthy controls following written informed consent approved by the University of Michigan Institutional Review Board. Samples were stored at −80 °C and thawed immediately prior to proteomic analysis. An initial 5000 g centrifugation was performed at 4 °C for 10 minutes to remove cellular debris. Approximately 30-50 mL of healthy control samples and 1-2 mL of CKD samples were concentrated using a 3 kDa cut-off filter membrane (Vivaspin 3 kDa MWCO, GE Healthcare, Buckinghamshire, UK, and Amicon Ultra 0.5 mL, Millipore, Ireland, respectively). As CKD subjects had higher urinary protein content (Table 1), the processed volumes were lower. Urine protein concentration was determined using Coomassie Protein Assay Reagent with BSA standard (Thermo Scientific, Rockford, Illinois). 200 μg of concentrated protein was utilized for downstream processing. Protein samples were exchanged into 50 mM ammonium bicarbonate buffer (pH 7.4). Urine creatinine concentration was determined by tandem mass spectrometry (MS/MS) as described previously by our group [17]. To determine the level of creatinine, a known amount of [²H₃]creatinine was spiked into each sample. A full-scan mass spectrum revealed molecular ions of m/z 114 and 117 for authentic creatinine and [²H₃]creatinine, respectively. The transitions m/z 114 to 44 and m/z 117 to 47 were monitored in multiple-reaction monitoring mode for authentic creatinine and [²H₃]creatinine, respectively, utilizing an Agilent Technologies (New Castle, DE) 6410 Triple Quadrupole mass spectrometer system equipped with an Agilent 1200 series HPLC system. The creatinine concentration in the urine sample was determined by comparing the peak areas for authentic creatinine and [²H₃]creatinine for the above transitions.

Figure 1: Venn diagram of the total urinary glycoproteins detected in healthy controls and CKD subjects. Tryptic digests of urine glycoproteins were subjected to LC-ESI-MS/MS analysis, and the proteins were identified as described in Section 2. 35 proteins were unique to healthy control subjects while 8 proteins were unique to subjects with CKD. 79 proteins were present in both groups.
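The creatinine measurement described in this section is a stable-isotope dilution calculation. The following minimal sketch shows the arithmetic under the assumption that authentic creatinine and [²H₃]creatinine give equal detector response per mole; the function and parameter names are hypothetical and are not taken from the published method [17].

```python
def creatinine_concentration(area_authentic, area_labeled, spiked_concentration):
    """Estimate creatinine concentration by isotope dilution.

    area_authentic: peak area of the m/z 114 -> 44 transition (authentic creatinine)
    area_labeled: peak area of the m/z 117 -> 47 transition ([2H3]creatinine)
    spiked_concentration: known concentration of the labeled standard in the sample

    Assumption: equal response for the labeled and unlabeled forms, so the
    peak-area ratio equals the molar ratio.
    """
    if area_labeled <= 0:
        raise ValueError("internal standard peak area must be positive")
    return (area_authentic / area_labeled) * spiked_concentration


# Illustrative numbers only: an area ratio of 2 with a 5.0 mM spike
example = creatinine_concentration(2000.0, 1000.0, 5.0)
```

The result carries the same units as the spiked standard, so the unit bookkeeping stays with the caller.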

Glycoprotein Separation and Enrichment.
In order to assess recovery following the enrichment procedure, 5 μg of invertase from Saccharomyces cerevisiae (Sigma, St. Louis, MO) was spiked into 200 μg of protein in every sample. Glycoproteins were enriched from urinary proteins utilizing the hydrazide resin capture protocol as described previously by Zhang et al. [18]. Briefly, samples were oxidized with 10 mM sodium metaperiodate and then incubated with hydrazide resin overnight at room temperature. Samples were then centrifuged at 3000 g for 2 minutes, and the resin was washed successively with equal volumes of 50 mM ammonium bicarbonate buffer (pH 7.4; Buffer A) supplemented with 8 M urea, followed by Buffer A alone and then water. The beads were resuspended in water, and the protein was reduced with 5 mM DTT followed by alkylation with 15 mM iodoacetamide. Trypsin (sequencing grade modified trypsin, Promega Corporation, Madison, WI) at a 1:20 μg ratio was added to the samples and incubated overnight at 37 °C for digestion. Following digestion, the beads were centrifuged at 3000 g for 2 minutes, and the resin was then washed successively with 1.5 M NaCl, 80% acetonitrile, 100% methanol, and Buffer A. The resin was then resuspended in Buffer A and incubated with 5 units of PNGaseF (New England Biolabs, Ipswich, MA) overnight at 37 °C for glycopeptide release. The glycopeptides were cleaned using a reverse phase column and eluted with 50% acetonitrile/0.1% TFA followed by elution with 80% acetonitrile/0.1% TFA. The peptides were then dried at 60 °C in a vacuum centrifuge and stored for mass spectrometric analysis.

Figure 2: Tryptic digests of urine glycoproteins were subjected to LC-ESI-MS/MS analysis as described in Section 2. The mass spectra of peptide DIVEYYNDSNGSHVLQG from zinc alpha 2 glycoprotein, which is upregulated (a), and of peptide AVLVNNITTGER from Golgi phosphoprotein, which is significantly downregulated in CKD subjects (b), are shown. The N-linked glycosylation site of each peptide is depicted in red.

Liquid Chromatography Electrospray Ionization (ESI/LC) MS/MS Analysis.
Peptide samples were resuspended in 0.1% formic acid and loaded onto an in-house packed reverse phase separation column (0.075 × 100 mm, MAGIC C18 AQ particles, 5 μm, Michrom Bioresources). The peptides were separated on a 1% acetic acid/acetonitrile gradient system (5-50% acetonitrile for 75 min, followed by a 10 min 95% acetonitrile wash) at a flow rate of ∼300 nL/min. Peptides were directly sprayed onto the MS using a nanospray source. An LTQ Orbitrap XL (Thermo Fisher Scientific, Waltham, MA) was run in automatic mode, collecting a high-resolution MS scan (FWHM 30,000) followed by data-dependent acquisition of MS/MS scans on the 9 most intense ions (relative collision energy ∼35%). Dynamic exclusion was set to collect 2 MS/MS scans on each ion and exclude it for an additional 2 min. Charge state screening was enabled to exclude +1 and undetermined charge states.

Data Processing and Statistical Analysis.
The Human UniProt database (Release 2011-5) was appended with a reverse database, a common contaminant list, and yeast invertase. Raw files were converted to mzXML format and searched against the database using X!Tandem with a k-score plug-in, an open-source search engine developed by the Global Proteome Machine (http://www.thegpm.org/). The search parameters were as follows: (1) precursor mass tolerance window of 100 ppm and fragment mass [...], and +0.9840 Da, reflecting the conversion of asparagine in the NxS/T motif to aspartate due to the release of the N-linked glycopeptides from their oligosaccharides. All proteins with a ProteinProphet probability of greater than 0.9 were considered as positive identifications [19]. Only proteins containing peptides with the NxS/T sequence motif were included for statistical analysis.
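The NxS/T motif filter can be illustrated with a short scan over peptide sequences. This is a minimal sketch, not the pipeline used in the study: the function name is hypothetical, and excluding proline at the x position is the conventional definition of the N-glycosylation sequon rather than something stated in the text. The two peptides are those shown in Figure 2.

```python
import re

# Mass shift reported in the text for Asn -> Asp conversion after PNGase F release
DEAMIDATION_SHIFT_DA = 0.9840

# Lookahead so that overlapping motifs are all reported.
# Assumption: x is any residue except proline (conventional sequon definition).
SEQUON = re.compile(r"(?=N[^P][ST])")


def sequon_positions(peptide):
    """Return 1-based positions of asparagines that sit in an NxS/T motif."""
    return [match.start() + 1 for match in SEQUON.finditer(peptide.upper())]


def has_sequon(peptide):
    """True if the peptide contains at least one NxS/T motif."""
    return bool(sequon_positions(peptide))


zag_sites = sequon_positions("DIVEYYNDSNGSHVLQG")   # zinc alpha 2 glycoprotein peptide
golgi_sites = sequon_positions("AVLVNNITTGER")      # Golgi phosphoprotein peptide
```

A protein would pass the filter described above if any of its identified peptides returns a non-empty list.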
Baseline characteristics of the control and CKD subjects were compared using Fisher's exact test for categorical variables and Student's t-test for continuous variables. Data are presented as means (±SD). Spectral counts for individual proteins were normalized to Saccharomyces cerevisiae invertase and to urine creatinine content. Spectral counts were compared across the two subject groups using the nonparametric Mann-Whitney test, and P values were adjusted for multiple comparisons using the False Discovery Rate (FDR) with reported q-values. All statistical analyses were performed with the use of SAS software, version 9.2.
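To make the FDR adjustment concrete, the sketch below converts a list of P values into q-values using the Benjamini-Hochberg step-up procedure. The text does not say which FDR estimator was used in SAS, so the choice of Benjamini-Hochberg is an assumption, and the function name and example P values are illustrative only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted values (q-values), in input order.

    Assumption: the source reports FDR q-values but does not name the
    estimator; Benjamini-Hochberg is used here as a common choice.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p to the smallest, enforcing monotone q-values.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        q_values[index] = running_min
    return q_values


# Illustrative P values, not data from the study
example_q = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```

With many proteins tested at once, a protein with a nominal P below 0.05 can still have a q-value well above 0.05, which is the situation the adjustment is meant to expose.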

Gene Ontology Analysis.
Significant proteins of interest were analyzed using the Gene Ontology Database (Gene Ontology Consortium, http://www.geneontology.org, Princeton University, New Jersey, US; [20]). For a given Gene Ontology (GO) category, the relative enrichment of genes encoding the proteins detected in CKD relative to all reference genes in that category was calculated as previously described using GO Tools made available by the Bioinformatics Group at the Lewis-Sigler Institute (Princeton University, New Jersey, US; [21]). A cutoff value of P < 0.01 was used to report a functional category as significantly overrepresented. To address the multiple comparisons problem that arises when many processes are evaluated simultaneously, the analysis included calculation of the FDR [21]. To improve statistical confidence in our results, all enriched functional categories were required to be significant using both methods (P < 0.01 and FDR < 0.05).
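Overrepresentation of a GO category is typically scored with a hypergeometric tail probability. The sketch below shows that calculation in isolation; it is offered as an illustration of the idea and is not a reimplementation of GO Tools, and all names and numbers in it are hypothetical.

```python
from math import comb


def go_enrichment_p(hits_in_list, list_size, annotated_in_background, background_size):
    """Upper-tail hypergeometric probability for GO term overrepresentation.

    Probability of seeing at least `hits_in_list` annotated genes when drawing
    `list_size` genes without replacement from a background of
    `background_size` genes, of which `annotated_in_background` carry the term.
    """
    upper = min(list_size, annotated_in_background)
    total = comb(background_size, list_size)
    tail = sum(
        comb(annotated_in_background, i)
        * comb(background_size - annotated_in_background, list_size - i)
        for i in range(hits_in_list, upper + 1)
    )
    return tail / total


# Toy numbers: 2 of 2 selected genes annotated, 5 of 10 background genes annotated
toy_p = go_enrichment_p(2, 2, 5, 10)
```

The resulting P values for all tested categories would then be passed through an FDR step before applying the two thresholds given above.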

Study Subject Characteristics.
Urine was isolated from six subjects with CKD and six age-matched healthy controls.
Baseline subject characteristics are provided in Table 1. Two important issues were considered with patient selection. First, the etiology of CKD was chosen to be diverse. This would ensure robustness of the putative markers as a CKD marker rather than a disease-specific marker. Second, we specifically targeted early Stage 3 CKD subjects to identify early disease markers that would potentially indicate pathways dysregulated early in the course of disease. This might offer mechanistic insights into disease pathogenesis and progression and have implications in therapeutic strategies. The six subjects had biopsy-proven diabetic nephropathy, lupus nephritis (n = 2), postacute tubular necrosis damage, NSAID nephropathy, and membranoproliferative glomerulonephritis, respectively. The mean estimated glomerular filtration rate (eGFR) was 83 mL/min in control subjects and 52 mL/min in CKD subjects.

Glycoprotein Spectral Count Normalization.
Glycoproteins were extracted and enriched from the twelve urinary samples. To account for variations in the glycoprotein extraction efficiency, 5 μg of the yeast protein invertase from Saccharomyces cerevisiae was added to each sample prior to extraction. After addition to the database, invertase spectral count served as a surrogate marker for extraction efficiency in each individual sample. Invertase spectral counts ranged from 31 to 122 in the twelve samples with an average spectral count of 86 (±31). Each sample was normalized independently to the invertase spectral counts.
To account for intersubject urine concentration variability, spectral counts were then normalized to urine creatinine content. This provides standardization for urinary creatinine excretion and concentration differences which can vary with volume status, stress, diet, activity level, age, gender, and overall health status [22]. Indeed, this normalization is commonly followed in clinical practice where degree of urinary protein is normalized to creatinine to obtain protein excretion rates [23]. Final spectral counts were expressed per mmol creatinine.
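The two normalization steps can be sketched as a single function. The text states that counts were normalized to invertase and then expressed per mmol creatinine but does not give an explicit formula, so the arithmetic below is an assumption: the raw count is scaled by a reference invertase count (here the reported mean of 86) divided by the sample's own invertase count, and then divided by creatinine content. The function name and example inputs are hypothetical.

```python
def normalize_spectral_count(raw_count, invertase_count, creatinine_mmol,
                             reference_invertase=86.0):
    """Normalize a protein spectral count for recovery and urine concentration.

    Assumed formula (not given explicitly in the text):
        raw_count * (reference_invertase / invertase_count) / creatinine_mmol
    """
    if invertase_count <= 0 or creatinine_mmol <= 0:
        raise ValueError("invertase count and creatinine must be positive")
    recovery_corrected = raw_count * reference_invertase / invertase_count
    return recovery_corrected / creatinine_mmol


# Illustrative sample: low recovery (invertase count 43) and 2.0 mmol creatinine
example_normalized = normalize_spectral_count(43, 43, 2.0)
```

Samples with poor glycoprotein recovery (low invertase counts) are scaled up, and concentrated urine (high creatinine) is scaled down, so counts from different subjects become comparable.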

Urine Glycoproteome Is Altered in CKD.
Urinary glycoproteins were isolated from six subjects with CKD and six healthy controls using a hydrazide technique as described in Section 2. A total of 122 glycoproteins were identified, of which 35 proteins were unique to healthy control subjects, 8 were unique to CKD subjects, and 79 were common to both groups (Figure 1, Table 2). Proteins unique to the CKD group were Antithrombin-III (SERPINC1), Complement factor H-related 1 (CFHR1), Desmoglein-2 (DSG2), Lumican (LUM), Lymphatic vessel endothelial hyaluronic acid receptor 1 (LYVE1), Pigment epithelium-derived factor (SERPINF1), Thyroxine-binding globulin (SERPINA7), and Zinc-alpha-2-glycoprotein (AZGP1). Figure 2 displays MS spectra of two individual glycopeptides with glycosylation motifs which were altered in CKD subjects. Zinc-alpha-2-glycoprotein is significantly upregulated in CKD (Figure 2(a)), while Golgi phosphoprotein is significantly downregulated in CKD (Figure 2(b)). Table 3 displays motifs and specific peptide modifications for all 122 unique proteins. Proteins were only included if the peptides contained the NxS/T motif.
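The group-level counts in Figure 1 follow from simple set operations. The sketch below assumes that a protein counts as present in a group if it was identified in at least one subject of that group, which the text implies but does not state outright; the function name and toy inputs are hypothetical.

```python
def venn_counts(control_subject_sets, ckd_subject_sets):
    """Count proteins unique to each group and shared between groups.

    Each argument is a list of sets, one set of protein identifiers per subject.
    Assumption: detection in any one subject counts as detection in the group.
    """
    control = set().union(*control_subject_sets)
    ckd = set().union(*ckd_subject_sets)
    return {
        "control_only": len(control - ckd),
        "ckd_only": len(ckd - control),
        "shared": len(control & ckd),
        "total": len(control | ckd),
    }


# Toy example with gene symbols that appear in the text; not the study data
toy = venn_counts(
    [{"UMOD", "EGF"}, {"UMOD", "CUBN"}],
    [{"UMOD", "LUM"}, {"AZGP1", "EGF"}],
)
```

The reported totals are internally consistent under this definition: 35 control-only, 8 CKD-only, and 79 shared proteins sum to 122.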
To test if proteins were significantly up- or downregulated in CKD, normalized spectral counts from the 6 CKD subjects were compared with those from the healthy controls. As the sample size was small and spectral counts were not normally distributed, comparisons were made with the nonparametric Mann-Whitney test. As 122 proteins were being simultaneously tested, the FDR and corresponding q-values were determined to account for false positive results. Table 4 [...] LGALS3BP, GOLPH2, HP, KNG1, LRG, SERPING1, PTGDS, AZGP1). Incidentally, not all proteins unique to the CKD or healthy control groups had statistically significant up- or downregulation. For example, lumican was not isolated in any healthy control subjects and was found in only three of the six CKD subjects. Thus, lumican is unique to CKD; however, as it was only seen in three CKD subjects, it was not significantly upregulated in CKD via nonparametric testing.
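The lumican case can be worked through with an exact permutation version of the Mann-Whitney test. This is a sketch under stated assumptions: the study used SAS, whose exact settings are not given, and the three non-zero CKD counts below are hypothetical placeholders, since only the detection pattern (absent in six controls, present in three of six CKD subjects) comes from the text.

```python
from itertools import combinations


def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def exact_mann_whitney_p(group_a, group_b):
    """Two-sided exact permutation P value for the rank-sum statistic."""
    pooled = list(group_a) + list(group_b)
    ranks = midranks(pooled)
    n_a = len(group_a)
    expected = n_a * (len(pooled) + 1) / 2
    observed_deviation = abs(sum(ranks[:n_a]) - expected)
    total = extreme = 0
    for chosen in combinations(range(len(pooled)), n_a):
        total += 1
        deviation = abs(sum(ranks[i] for i in chosen) - expected)
        if deviation >= observed_deviation - 1e-12:
            extreme += 1
    return extreme / total


controls = [0, 0, 0, 0, 0, 0]            # lumican not detected in any control
ckd = [0, 0, 0, 1.2, 2.5, 3.1]           # hypothetical non-zero values in three subjects
lumican_p = exact_mann_whitney_p(ckd, controls)
```

Under these assumptions, 168 of the 924 possible group assignments are at least as extreme as the observed one, giving a two-sided P of about 0.18. Detection in only half of the CKD subjects therefore cannot reach significance with six subjects per group, whatever the magnitude of the non-zero counts.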

Gene Ontology Analysis Reveals Enrichment for Distinct Biological Functions of Differentially Expressed Urinary Glycoproteins.
The 23 proteins with differential expression in CKD were subjected to a GO Database search and further analyzed with GO Tools [20,21]. GO Term Finder (http://go.princeton.edu/cgi-bin/GOTermFinder) allowed for clustered identification of proteins annotated to specific GO biological process, location, and function classifications. A subsequent GO Term Mapper (http://go.princeton.edu/cgi-bin/GOTermMapper) analysis of significantly altered proteins was performed to bin the proteins to GO parent terms or GO Slim terms (http://www.geneontology.org/GO.slims.shtml).
GO analysis (Figure 3) for biological processes demonstrated that 16 of the 23 proteins were linked to immune/stress response and biological process regulation (P < 1 × 10⁻⁴). 9 of the 23 were acute-phase and inflammatory response proteins (P < 1 × 10⁻³). Six proteins were regulators of hemostasis, platelet degranulation, and coagulation (P < 1 × 10⁻⁴), and 10 were involved in localization, transport, and secretion (P < 1 × 10⁻⁴). Other processes involved include metal ion homeostasis (4 proteins) and cell death (3 proteins). Table 5 displays function and location for the 23 proteins which were differentially expressed in CKD. 18 out of the 23 proteins localized to the extracellular region, consistent with possible extracellular matrix remodeling that typifies renal disease. The analysis also revealed 2 major clusters of molecular function: 20 out of the 23 proteins were involved in binding and protein-protein interactions (P = 5 × 10⁻⁴), and 5 proteins were endopeptidase inhibitors (P < 1 × 10⁻⁶). Collectively, these observations implicate the inflammatory/acute-phase response and extracellular matrix remodeling in CKD. They also strongly support the proposal that glycoproteomic analysis of urine might reveal mechanisms underpinning CKD.

Figure 3: Urinary glycoproteins that were differentially detected in CKD subjects were associated with biological functions using GO process annotations. This approach demonstrated significant overrepresentation of proteins involved in several categories, including regulation of response to stress, platelet activation/hemostasis/coagulation, acute-phase response, regulation of biological processes, localization, secretion, transport, and cell death. The bipartite network (generated using Cytoscape [24]) shows the relationship between GO process annotations (yellow hexagon nodes) and differentially regulated proteins in CKD subjects (white/red/green circular nodes). The size of the GO nodes is proportional to the number of edges (lines) that connect them to proteins. The 10 proteins that are altered with a q-value of < 0.05 are depicted in red (upregulation) and green (downregulation) in CKD subjects. GAA, 70 kDa lysosomal alpha-glucosidase; APOD, Apolipoprotein D; FETUA, Alpha-2-HS-glycoprotein chain B; ORM1, Alpha-1-acid glycoprotein 1; SERPINC1, Antithrombin-III; GLB1, Beta-galactosidase; CP, Ceruloplasmin; CUBN, Cubilin; EGF, Epidermal growth factor; E9KL23, Epididymis secretory sperm binding protein Li 44a; LGALS3BP, Galectin-3-binding protein; GOLPH2, Golgi phosphoprotein 2; HP, Haptoglobin beta chain; IGHG1, Ig gamma-1 chain C region; IGHG2, Ig gamma-2 chain C region; KNG1, Kininogen 1; LRG, Leucine-rich alpha-2-glycoprotein; SERPING1, Plasma protease C1 inhibitor; PTGDS, Prostaglandin D2 synthase 21 kDa; TF, Transferrin; AMBP, Trypstatin; UMOD, Uromodulin; AZGP1, Zinc-alpha-2-glycoprotein.

Discussion
CKD is a growing public health problem with dramatic increases in morbidity and mortality following progression to ESRD. Given this, there is a tremendous need for the development of biomarkers to predict CKD progression and allow for early therapeutic intervention. Urine proteomic strategies are now at the forefront of this search due to the sensitivity of MS/MS analysis and the ability to develop noninvasive biomarkers from a readily available biofluid. Significant progress has been made, particularly in diabetes, where urine proteomic analysis can predict nephropathy [6,25,26]. Despite these developments, the majority of proteomic studies have relied on two-dimensional (2D) differential in-gel electrophoresis for protein separation. Resulting samples, particularly in CKD subjects, contain large amounts of highly abundant plasma proteins due to nonspecific leakage through the glomerular filtration barrier. Targeted analyses of low-abundance proteins will likely lead to more disease-specific and clinically relevant protein biomarkers.
We therefore focused our attention on the urinary N-linked glycoproteome. Glycoproteins are an important protein subfraction accounting for up to 50% of the human proteome at any given time [27]. Due to their critical role in cell-cell interactions and signaling cascades, glycoproteins are promising markers for identifying kidney disease activity and progression. In this study we present an initial examination of the urinary N-linked glycoproteome in CKD subjects compared to healthy control subjects. We successfully isolated N-linked glycoproteins from twelve urine samples utilizing a hydrazide capture technique. 122 unique glycosylated proteins were detected amongst the twelve subjects (Table 3). This number is similar to other recent glycoproteome analyses. Ahn et al. recently reported isolating 164-174 unique proteins from human diabetic plasma using a multi-lectin column enrichment technique [16]. Yang et al. isolated 265 urinary glycoproteins from bladder cancer subjects and healthy controls, also utilizing a multi-lectin column for enrichment, but larger sample sizes were used than in our current study [15]. These results support a successful hydrazide-based technique for glycoprotein isolation in human urine. Further studies are required to identify optimal extraction strategies. We detected 8 glycoproteins unique to CKD subjects and 35 unique to healthy controls (Table 2). Additionally, of the 122 total proteins identified, 23 glycoproteins were differentially expressed in CKD subjects versus healthy controls. 18 were upregulated in CKD while 5 were downregulated (Table 4). Many of the differentially expressed proteins have been previously linked to kidney disease, supporting a potential role as CKD biomarkers. Two of the most significantly upregulated proteins in our CKD samples were AZGP1 and LRG, both of which are established inflammatory mediators.
Alteration of AZGP1 and LRG expression is predictive of acute kidney injury in postsurgical patients [28]. AZGP1 has also been shown to be increased in diabetes and diabetic nephropathy [13,29]. PTGDS, a known extracellular transporter for lipophilic molecules, is formed de novo in renal tubules [30]. PTGDS is upregulated in early diabetes [31] and is a marker of hypertension and latent renal injury [32]. SERPING1, an extracellular matrix regulator, is increased in acute renal allograft rejection perhaps suggesting an important role for collagen remodeling [33]. KNG1, a bradykinin precursor, has also been shown to be upregulated in acute renal allograft rejection [34], and gene variation induces altered aldosterone sensitivity in hypertensive subjects [35]. Interestingly, LUM, a proteoglycan, is a protein unique to CKD but without statistically significant upregulation. Altered regulation of LUM has been linked with abnormal collagen fibril morphology as a mediator of fibrotic disease in diabetic nephropathy [36,37]. CUBN, an apical protein in proximal tubule cells, was unique and downregulated in CKD. Recent investigation supports a role of CUBN in albumin reabsorption with genetic variance at this locus predicting microalbuminuria [38]. The decreased urinary CUBN excretion found in our CKD population may represent a dysfunctional variant or potentially a causative factor responsible for increasing proteinuria.
We used annotations by the GO Consortium and GO Tools to connect the complex array of proteins identified in CKD subjects to biological processes, protein function, and cellular location. Many of the multiprotein pathways differentially expressed in CKD are involved in coagulation, inflammation, and acute-phase response (Table 5, Figure 3). Twenty proteins were linked to protein-protein interactions and binding. Remarkably, there were altered levels of proteins involved in the acute-phase and immune/stress responses (18 out of 23), implicating a possible mechanistic role for these pathways in CKD. Our detection of several extracellular proteins and matrix remodeling proteases likely reflects the matrix remodeling that occurs in CKD. These findings are consistent with previous literature, as CKD is known to be associated with an increased propensity for atherosclerosis, endothelial dysfunction, increased basal inflammation, and altered stress response [39,40].
In this study, we have established normalization techniques which will be essential to future urine glycoproteome analyses. To account for variations in the glycoprotein extraction efficiency of individual samples, yeast invertase (yeast glycoprotein with several glycopeptides) was added to each sample prior to extraction. In this way, glycopeptides derived from invertase serve as an internal marker for the extraction efficiency in each sample. Our samples were also normalized for urine creatinine content. This is of particular importance as marked intersubject variability can exist in creatinine excretion in random urine specimens consistent with different concentrations due to hydration status. Indeed, such normalization would be essential to extrapolate net excretion rates of a given protein in 24 hours and is commonly employed in clinical practice to quantify albumin excretion rates [23].
In summary, we have utilized a hydrazide-based approach to enrich the urinary glycoproteome with subsequent identification of the urinary glycoproteins in a human CKD population for the first time. Our results indicate that urine carries a distinct population of glycoproteins that function in proteinase inhibition, protein binding, and the acute-phase/immune-stress response in subjects with CKD. It will be of interest to study a larger number of subjects to determine whether urinary levels of these proteins might be useful indicators of CKD and to investigate the proposal that these proteins could be markers of disease progression.

CKD: Chronic kidney disease
ESRD: End-stage renal disease
FDR: False discovery rate
GO: Gene ontology
LC-ESI-MS/MS: Liquid chromatography-electrospray ionization tandem MS analysis
MS: Mass spectrometry