lncRNA GAU1 Induces GALNT8 Overexpression and Potentiates Colorectal Cancer Progression

lncRNAs are key epigenetic regulators of biological processes. In the human cancer transcriptome library MiTranscriptome, we identified GAU1 as the top upregulated lncRNA in colorectal cancer (CRC) by sample set enrichment analysis (overexpression ranking percentile = 99.75%, P < 10^-50). GAU1 is coexpressed with the potential oncogene GALNT8 (Spearman rho = 0.67, P = 2.44 × 10^-23, TCGA dataset n = 184). Experimental data revealed that GAU1 regulates the expression of GALNT8. Overexpression of either GAU1 or GALNT8 significantly promotes cell cycle progression and proliferation of CRC cell lines and correlates with poor prognosis in patients with CRC (P = 3.04 × 10^-2), while silencing of GAU1 or GALNT8 suppressed cancer cell proliferation and induced resistance to oxaliplatin treatment in CRC cell lines in vitro. Our results suggest that the previously less studied GAU1 and GALNT8 may serve as CRC prognosis markers and potential targets for chemotherapy treatment.


Introduction
Colorectal cancer (CRC) ranks as the third most common type of cancer, accounting for about 10% of all cases [1]. In 2018, there were over one million new cases and over half a million deaths from the disease [2]. Genetic mutations in APC [3], TP53 [4], and K-RAS [5] have been intensively studied as major contributors to the tumorigenesis of CRC. In addition, nongenetic risk factors such as aging and lifestyle also contribute to the development of CRC. However, these nonmutational alterations in CRC have been less studied [6]. Massively parallel sequencing facilitated the genome-wide characterization of the human cancer transcriptome and identified long noncoding RNA (lncRNA) expression as the most common transcriptional alteration in cancer [7]. Our previous reports revealed that lncRNAs are extensively involved in CRC development [8] and drug resistance [9], indicating that more effort should be devoted to identifying CRC-specific lncRNA expression and to linking these noncoding "regulators" to their biological "operators." RNA-Seq technology empowered by sequence alignment and assembly provides a revolutionary approach for the prediction of full-length transcripts from both the intergenic "gene desert" and protein-coding loci [10,11]. The MiTranscriptome database applied ab initio assembly to 7,256 curated RNA-Seq libraries from tumors, normal tissues, and cell lines to provide an unbiased method for gene discovery [12]. Here, by combining this ab initio assembly-based human cancer transcriptome database with experimental validation, we identified the colorectal cancer-related lncRNA GAU1 from 12,382 cancer-associated lncRNA transcripts and verified that its procancer function involves upregulating the mRNA expression of the polypeptide N-acetylgalactosaminyl transferase GALNT8, whose overexpression correlates with cancer cell proliferation and poor patient survival.

Materials and Methods
2.1. Identification of GAU1 as the CRC-Related lncRNA. The normalized counts of 12,382 ab initio-assembled lncRNA transcripts and the library information of 6,476 RNA-Seq libraries (5,724 cancer-related samples and 752 normal samples), including 5,602 TCGA cases, were downloaded from the MiTranscriptome website (http://MiTranscriptome.org/download/MiTranscriptome.expr.counts.tsv.gz).
Sample set enrichment analysis (SSEA) [12] was performed to test whether a transcript is differentially expressed between the cancer and noncancer samples using an empirical ranking method. In brief, a weighted KS test was performed, as in gene set enrichment analysis (GSEA) [13], to generate the enrichment score (ES) describing the enrichment of the sample set among all tested samples. SSEA was further performed 1,000 times with random permutation of the sample labels to generate a set of null ES, and the nominal P value was derived from the relative rank of the observed ES within the null ES. Hypothesis testing was performed by comparing the tested ES to the null normalized enrichment score (NES) for all transcripts in a sample set. The SSEA percentile score was generated by ranking the transcripts in each analysis by their NES. The tissue-type information of each transcript was obtained from the MiTranscriptome browser (http://MiTranscriptome.org).
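The ranking-and-permutation logic described above can be illustrated with a minimal sketch. It is not the published SSEA implementation: the function names, the weighting exponent, the same-sign normalization used for the NES, and the toy input are all assumptions made for illustration, and the real tool may differ in weighting and normalization details.

```python
import random

def enrichment_score(values, in_set, weight=1.0):
    """GSEA-style weighted KS running-sum statistic (illustrative only).

    values : expression of one transcript in every sample
    in_set : True for samples that belong to the tested sample set
    """
    # Walk through samples from highest to lowest expression.
    order = sorted(range(len(values)), key=lambda i: -values[i])
    hit_total = sum(abs(values[i]) ** weight for i in order if in_set[i])
    n_miss = sum(1 for i in order if not in_set[i])
    if hit_total == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for i in order:
        if in_set[i]:
            running += abs(values[i]) ** weight / hit_total  # weighted hit step
        else:
            running -= 1.0 / n_miss                          # uniform miss step
        if abs(running) > abs(best):
            best = running  # ES is the maximum deviation from zero
    return best

def permutation_test(values, in_set, n_perm=1000, seed=0):
    """Return (ES, NES, nominal P) from label permutations (assumed scheme)."""
    rng = random.Random(seed)
    observed = enrichment_score(values, in_set)
    labels = list(in_set)
    null = []
    for _ in range(n_perm):
        rng.shuffle(labels)  # random permutation of the sample labels
        null.append(enrichment_score(values, labels))
    # Compare the observed ES only with null ES of the same sign.
    same_sign = [abs(es) for es in null if (es >= 0) == (observed >= 0)]
    n_extreme = sum(1 for es in same_sign if es >= abs(observed))
    p_value = (n_extreme + 1) / (len(same_sign) + 1)
    nes = observed / (sum(same_sign) / len(same_sign)) if same_sign else float("nan")
    return observed, nes, p_value

# Hypothetical toy data: four "cancer" samples followed by four "normal" samples.
expression = [10, 9, 8, 7, 4, 3, 2, 1]
is_cancer = [True, True, True, True, False, False, False, False]
es, nes, p = permutation_test(expression, is_cancer)
```

Ranking transcripts by the resulting NES within one analysis would then give a percentile score of the kind reported for GAU1.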
To perform GAU1 coexpression analysis, the normalized RSEM-FPKM mRNA expression of 382 TCGA CRC samples was obtained from TCGA firehose legacy (https://gdac.broadinstitute.org/). After overlapping the samples with the MiTranscriptome database, Spearman's rank correlation coefficient between GAU1 expression and the mRNA expression of each of the 19,815 protein-coding genes was calculated in 184 TCGA CRC samples.
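As a rough illustration of the coexpression screen described above, the following sketch computes Spearman's rho between one lncRNA and each protein-coding gene and sorts the genes by correlation. The function names and the toy expression values are hypothetical; the sketch uses only the Python standard library and assumes that all vectors are ordered by the same overlapping samples.

```python
import math

def average_ranks(xs):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1  # extend the block of tied values
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        return float("nan")  # undefined for a constant vector
    return cov / (sx * sy)

def rank_coexpressed(lncrna, mrna_by_gene):
    """Return (gene, rho) pairs sorted from most to least positively correlated."""
    rhos = {gene: spearman_rho(lncrna, expr) for gene, expr in mrna_by_gene.items()}
    return sorted(rhos.items(), key=lambda item: item[1], reverse=True)

# Hypothetical toy data for five samples (not values from the study).
gau1 = [1, 2, 3, 4, 5]
genes = {
    "GENE_A": [2, 4, 6, 8, 20],
    "GENE_B": [5, 4, 3, 2, 1],
    "GENE_C": [1, 3, 2, 5, 4],
}
ranked = rank_coexpressed(gau1, genes)
```

In the study's setting, the first entry of such a ranking would correspond to the most strongly coexpressed protein-coding gene.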

Clinical Samples and Tissue Microarray
Primary CRC tissues and paired adjacent tissues were collected from 66 CRC patients. All these samples were obtained between 2015 and 2017 and stored at -80°C.
Tissue microarrays (TMAs) with 55 paired cases of CRC and adjacent nontumorous tissues, plus 14 individual CRC tissues, were obtained from Shanghai Tenth Hospital (Shanghai, PR China). These CRC specimens were collected from CRC patients between 2010 and 2015, and the patients were followed until April 2019. No patient received chemotherapy or radiation before surgery, and no other concurrent cancer was observed in the patients. The Institutional Review Boards of both Shanghai Tenth Hospital and Huashan Hospital, Fudan University, approved our study in compliance with the Helsinki Declaration of 1975 as revised in 1996. All patients provided signed informed consent before surgery. The clinical stages were classified by the American Joint Committee on Cancer and Union for International Cancer Control (AJCC/UICC) classification system [14]. Overall survival (OS) is defined as the time interval between the date of surgery and death.
2.3. Cell Culture and Stable Cell Line Establishment. Human embryonic kidney cell line HEK293T and human colon/rectum cancer cell lines LoVo, DLD1, SW620, and HCT116 were purchased from Shanghai Institute of Biological Sciences. All cell lines were maintained in Dulbecco's modified Eagle's medium (DMEM) (Gibco, CA, USA) with 10% FBS (Gibco) at 37°C in an atmosphere of 5% CO2.

Cell Proliferation and Cell Cycle Assay
Cancer cell lines were seeded at 1 × 10^3 cells per well in 96-well plates. Cell proliferation was assessed with Cell Counting Kit-8 (CCK8, MCE, #HY-K0301) every 24 hours for 5 days. The colony formation ability of cancer cell lines was measured by 0.1% crystal violet/methanol staining 10 days after cell seeding in six-well plates at a density of 1 × 10^3 cells per well. Any colony containing more than 50 cells was counted.
Cell cycle analysis was performed with Propidium Iodide (PI) staining. A total of 10^6 cells were rinsed twice with cold PBS, then fixed with 75% ethanol overnight at -20°C, rinsed three times with PBS, and resuspended in 0.5 ml FxCycle™ PI/RNase Staining Solution (Life Technologies, #F10797). The cell suspension was kept for 15 min in the dark and then immediately subjected to flow cytometry analysis on a FACSCanto system (BD Biosciences).
2.6. Quantitative Real-Time PCR. Trizol reagent (Invitrogen) was used to extract total RNA from tissues or cells. Reverse transcription was performed with PrimeScript™ RT Reagent Kit (TaKaRa Biotechnology, #RR047A). Quantitative real-time PCR was conducted with TB Green Premix (TaKaRa Biotechnology, #RR820A) and gene-specific primers (Table 1) on an Applied Biosystems 7500 system (ABI); β-actin was used as the housekeeping gene for mRNA expression normalization (Table 1). Relative expression of GALNT8 and GAU1 was calculated with the 2^-ΔΔCt method.
The denatured protein samples were resolved by SDS-PAGE and transferred onto polyvinylidene fluoride (PVDF) membranes (Millipore, #ISEQ00010). After blocking with 5% skimmed milk in PBST, the PVDF membranes were incubated with specific primary antibodies overnight at 4°C. After three 10-minute rinses with TBST buffer, the membranes were incubated with secondary antibodies for 1 hour at room temperature and rinsed 3 times with TBST buffer for 10 minutes. Signals were detected with enhanced chemiluminescence (ECL) substrate (ThermoFisher, #32106) on a Las-3000 Luminescent Image Analyzer (Fujifilm, Japan).
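The 2^-ΔΔCt calculation named in the quantitative real-time PCR method above can be written out as a short worked example. This is a minimal sketch: the function name, the argument names, and the Ct values are hypothetical, and it assumes the standard form of the method in which the target gene is normalized to the housekeeping gene (here β-actin) and then to a control sample.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^-ddCt method.

    ct_target_* : Ct of the target gene (e.g. GAU1 or GALNT8)
    ct_ref_*    : Ct of the housekeeping gene (e.g. beta-actin)
    *_sample    : tested condition; *_control : calibrator condition
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample      # dCt of the tested sample
    delta_ct_control = ct_target_control - ct_ref_control   # dCt of the control
    delta_delta_ct = delta_ct_sample - delta_ct_control     # ddCt
    return 2.0 ** (-delta_delta_ct)

# Hypothetical Ct values: the target crosses threshold two cycles earlier
# in the sample than in the control, with identical housekeeping Ct values.
fold_change = relative_expression(24.0, 18.0, 26.0, 18.0)  # 2^-(6 - 8) = 4.0
```

A ΔΔCt of -2 therefore corresponds to a fourfold higher relative expression in the tested sample, under the usual assumption of approximately 100% amplification efficiency.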
2.9. TMA Staining and Immunohistochemistry. The TMA slide was air-dried at 60°C for an hour and treated with 0.01 M citric acid buffer solution for antigen retrieval. After cooling down to room temperature, the slide was further treated with 3% H2O2 solution in methanol for 10 minutes and rinsed 3 times with cold PBS before incubation with primary anti-GALNT8 antibody (1 : 100) at 4°C overnight. The slide was rinsed three times for 5 minutes and then incubated with ready-to-use biotinylated goat anti-rabbit IgG (Abcam, #ab64256) solution for 15 minutes at room temperature, followed by five rinses with PBS. Streptavidin peroxidase complex (Abcam, #ab64269) was applied to the TMA, incubated for 10 minutes at room temperature, and rinsed five times with PBS. After visualization with diaminobenzidine chromogen (Abcam, #ab64238) and hematoxylin counterstaining, the TMA was imaged using a Nikon Eclipse E-800 microscope. The stained TMA was then independently reviewed by two pathologists and rated for the grade of GALNT8 staining with scores of -, +/-, +, ++, and +++.

Cytotoxic Assay.
For SW620 and DLD1, the cells with manipulated GAU1 expression or control cells were seeded in 96-well plates at a density of 1 × 10^3 cells per well and incubated in low serum medium (1% v/v FBS) with or without oxaliplatin. Cells were replenished with fresh low serum medium with or without oxaliplatin on the third day. The Cell Counting Kit-8 (CCK8, MCE, #HY-K0301) assay was used to estimate cell viability at the end of the fifth day of treatment.

GAU1 Overexpression Facilitates CRC Cell Proliferation by Promoting Cell Cycle.
To further determine whether GAU1 overexpression can alter the biological phenotype of CRC, we first established GAU1-overexpressing stable cell lines by lentiviral infection of pCDH-GAU1 in the SW620 and HCT116 cell lines, which have intermediate GAU1 expression. The CCK-8 and clonogenic assays both revealed that GAU1 overexpression led to significantly increased cell proliferation in the CRC cell lines compared to the vehicle controls (Figures 2(a) and 2(b)). Consistently, GAU1 knockdown by short interfering RNA (siRNA) in the GAU1 high-expressing LoVo and DLD1 cell lines significantly reduced the cell proliferation and clonogenic ability of the CRC cells (Figures 2(c) and 2(d)). These data suggested that GAU1 overexpression promotes CRC cell proliferation in vitro. Moreover, the cell cycle profile alteration after GAU1 overexpression (increased S-phase commitment) (Figure 2(e)) also implicated GAU1 as a critical player in promoting S-phase entry.
Moreover, compared with a human intestinal epithelial cell line, a higher expression of GALNT8 in CRC cells was observed at both the mRNA and protein levels (Figure 3(g)).
To further confirm the regulatory effect of GAU1 on GALNT8 expression, the effect of GAU1 knockdown/overexpression on the expression levels of GALNT8 in CRC cells was determined. The mRNA and protein expression levels of GALNT8 were increased in the GAU1 overexpression cell lines and decreased in the siGAU1 cell lines compared with the control group (Figures 3(h) and 3(i)). Altogether, the

The Oncogenic Ability of GAU1 Is GALNT8 Dependent.
Since knowledge of the relationship between GALNT8 and cancer is limited, we experimentally manipulated the expression of GALNT8 by lentiviral stable overexpression and siRNA interference. CCK-8 and colony forming assays demonstrated that the overexpression of GALNT8 enhanced the proliferation and colony formation capacity of SW620 and HCT116 (Figures 4(a) and 4(b)), whereas the opposite results were observed in the GALNT8-suppressed DLD1 and LoVo cell lines (Figures 4(c) and 4(d)). Together, these results suggest that GALNT8 contributes to CRC cell proliferation.
To further explore the oncogenic partnership of the GAU1/GALNT8 cluster in CRC, siGALNT8 or negative control was transfected into GAU1-overexpressing cell lines to examine whether GALNT8 silencing could rescue the enhanced proliferation of CRC cells mediated by GAU1 overexpression.
The CCK-8 and colony formation assay results demonstrated that the upregulated cell proliferation and colony formation in the GAU1-overexpressing SW620 and HCT116 cell lines were partially attenuated by siGALNT8 (Figures 4(e) and 4(f)), suggesting that GALNT8 is a critical downstream operator of GAU1 during CRC proliferation.
Gastroenterology Research and Practice
further confirmed by the drug resistance phenotype in GAU1/GALNT8 knockdown cell lines (Figure 4(g)).

Discussion
CRC is one of the most common and lethal types of cancer [16]. In the past decades, genetic alterations including APC and K-RAS somatic mutations have been identified as the cause of 70% of CRC cases [17] and have been widely adopted in diagnosis and drug response prediction during CRC management [18]. Recent studies have identified transcriptional alteration of lncRNAs as a hallmark of tumor development [19,20]. The enormous efforts to map the landscape of lncRNA expression in cancer [21,22] have led to a number of notable investigations that improved the understanding of multiple major cancer types [7].
In this study, we identified GAU1 as one of the major oncogenic lncRNAs for CRC by mining the ab initio assembly-based lncRNA database MiTranscriptome [10,11]. According to our analysis, GAU1 ranked as one of the most differentially expressed lncRNAs between CRCs and normal tissues/cell lines (99.75% percentile of SSEA). Moreover, the overexpression of GAU1 is associated with significantly reduced CRC patient survival (P = 3.04 × 10^-2). After experimentally validating the procancerous ability of GAU1 by cell proliferation assays following GAU1 expression manipulation in CRC cell lines, we further identified GALNT8 as the protein-coding gene most strongly coexpressed with GAU1.
GALNT8 encodes a 637-amino-acid type-II membrane protein (GalNAc-T8) [23]. The protein is a member of the UDP-GalNAc polypeptide N-acetylgalactosaminyl transferase (ppGaNTase) family, which initiates mucin-like O-linked protein glycosylation in the Golgi apparatus [24]. Previous research revealed that GALNT8 is expressed in the heart, placenta, skeletal muscle, liver, and kidney and plays a key role during embryonic development [23]. However, the oncogenic effect of GALNT8 is less characterized. Chai et al. reported GALNT8 as an oncogene in retinoblastoma that potentially drives cancer development and progression [25], with GAU1 directly binding to the GALNT8 promoter and boosting the transcription of GALNT8 through TCEA1 (Transcription Elongation Factor A1) recruitment, which mechanistically supports our experimental data in CRC.
Like GAU1, GALNT8 is also associated with poor CRC prognosis (P = 0.31 × 10^-2). This association is complemented by two lines of experimental evidence: (1) overexpression or silencing of GALNT8 mimicked the phenotypic alterations of cancer cell lines after GAU1 overexpression or knockdown, and (2) GALNT8 knockdown attenuated the GAU1 overexpression-induced cell proliferation, but not vice versa. Together, these findings support GALNT8 as the downstream operator of GAU1 in CRC.
Aside from surgery, systemic chemotherapy with folinic acid, fluorouracil, and oxaliplatin (FOLFOX) is also a main treatment option for CRC. Our results showed oxaliplatin hypersensitivity in cancer cell lines overexpressing GAU1/GALNT8. This double-edged sword effect of GAU1/GALNT8 overexpression suggests the GAU1/GALNT8 axis as a potential marker in the precision medicine of CRC, although more experimental evidence is needed in the future.
One limitation of this study is that we did not characterize the molecular interaction between GAU1 and GALNT8. Although we have confirmed GALNT8 as the essential operator for the oncogenic ability of GAU1, the regulatory mechanism between this bidirectionally transcribed lncRNA/protein-coding gene pair needs to be clarified by protein-RNA interaction or DNA-RNA binding assays. Given the previous report that GAU1 and GALNT8 share a cis-regulation relationship in retinoblastoma [25] and the shared promoter region of the two genes, the mechanism behind the abnormal promoter activation in CRC should be investigated in our future studies.
To the best of our knowledge, this is the first study systematically reporting the oncogenic cascade of the GAU1/GALNT8 axis in CRC. By integrating the differential expression data from 7,256 curated RNA-Seq libraries in MiTranscriptome with experimental validation, we demonstrated that GAU1, together with its downstream protein GALNT8, is associated with cancer cell proliferation, poor patient survival, and chemotherapy response.

Data Availability
The data used to support the findings of this study are included within the article.

Conflicts of Interest
The authors declare that they have no conflicts of interest.