A Bioinformatic Profile of Gene Expression of Colorectal Carcinoma Derived Organoids

Colorectal carcinoma is one of the common cancers in human. It has been intensely debated whether the in vitro cancer cell lines are closely enough for recapitulating the original tumor in understanding the molecular characteristic of CRC. Organoid as a new in vitro 3D culture system has sprang out in CRC study for the capability in reviving the original tissue. The aim of this study is to profile the gene expression of CRC organoid. The gene expression GSE64392 was from GEO database contained 20-patients-derived 37 organoid samples, including 22 colorectal tumor organoid samples and 15 paired healthy samples. Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) were applied for classifying differentially expressed genes (DEGs). Protein interaction among DEGs was analyzed by Search Tool for the Retrieval of Interacting Genes (STRING) and Cytoscape software. In total, 853 gene sequences were identified. GO analysis revealed that DEGs were extensively involved in various biological process (BP), like proliferation, cell cycle, and biosynthesis. KEEG pathway analysis showed that WNT, MAPK, TGF-β, SHH, ECM-receptor interaction, and FGF pathways were altered. DEGs which were identified with protein interactions were major response for extracellular matrix organization and the GPCR pathway. In conclusion, our study profiled the DEGs in CRC organoids and promotes our understanding of the CRC organoids as a new model for colorectal cancer research.


Introduction
Colorectal carcinoma (CRC) is one of the major cancers and is a contributor to cancer mortality and mobility in human. A variety of studies have revealed critical mutations of genes and the dysregulation of signaling pathways is important for the development of CRC [1]. Nevertheless, like other cancers, CRC presents the instability in genome, which usually leads to the diversity of cancer cell phenotype [1]. The genomic instability consisting of gene mutation and chromosomal hyperchange has been investigated as an important contributor of CRCs [1,2]. To date, cancer cell lines are still mainly used in tumor research, as the accessibility and ease in manipulation [3], whereas cancer cell lines can be representative of tumors as an in vitro system is controversial [4,5]. Recently, the Cancer Cell Line Encyclopedia (CCLE) characterized nearly 1,000 cancer cell lines via larger-scale genomic application [6]. Most cancer cell lines exhibited a relatively positive correlation in representing the original tumors from which cancer cell lines were derived. However, most cancer cell lines were derived from highly aggressive and fast growing tumor [3]; they tend to possess more genomic alterations than primary tumor, leading to be partial in representing the initiation or development of tumor [3]. Cancer cell line apparently limits in representing clinical attributes, like diagnosis, drug response, and treatment. A recent developed 3D culture system, organoid technology, demonstrated the maintenance of primary crypt physiology [7]. Then a long-term culture system was established for human intestinal and colonic epithelium organoid, indicating the application in development, pharmacology, and tumorigenesis of colon [8]. In addition, a large scale sequencing has been performed to characterize the developmental lineage tree on the organoid platform, reviving several features of normal mouse development [9]. Most recently, an established organoid bank of CRC patients resembled the primary tumor tissue physiologically [10]. Moreover, the genomic feature of tumor organoid mimics the primary tissue extensively [10]. Thus, analyzing the gene expression profile and the interaction of differentially expressed genes (DEGs) network of CRC organoid is crucial for understanding the biomedical application of organoid technology in CRCs and sustains that organoid is promising in personalized CRC therapy.
In this study, we analyzed gene expression profiles of healthy and tumor organoids of CRC patients with the GEO2R supported by Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/). Subsequently, the DEGs were subjected to DAVID, to perform the gene ontology (GO) and pathway enrichment analysis. Then, we investigated the protein interaction among the DEGs. Our study may provide insight of organoids derived from patients as a potential 3D system for investigating the development of CRCs.

Gene Expression Profile Analysis.
Differentially expressed genes were analyzed with the GEO2R, which accompanies with the GEO dataset and is supported by GEO database and available at https://www.ncbi.nlm.nih.gov/geo. Data were analyzed with default parameters. Genes with Log2-fold change between tumor and healthy samples ≥1 or ≤-1 were classified as differentially expressed genes (DEGs) and the adjusted P-value (adj.P.Val) < 0.05 was considered as statistical significance.

Gene Ontology and Pathway
Analysis. Database for Annotation, Visualization, and Integrated Discovery (DAVID: https://david.ncifcrf.gov/) is a web application integrated various annotation sources including Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) and is essential for the interpretation of high-throughput datasets. DAVID was applied for analyzing the enrichment of GO and KEGG pathway of DEGs. P<0.05 was considered as statistical significance.

Protein-Protein Interactions (PPIs) Analysis.
Search Tool for the Retrieval of Interacting Genes (STRING) is a webaccessible database of protein-protein interactions (PPIs). STRING (version 10.5) currently covers 9 643 763 proteins from 2 031 organisms, including Homo sapiens. To evaluate the protein associations among DEGs, we mapped the DEGs to STRING; interactions with combined score ≥ 0.4 (medium confidence) were considered significant. Cytoscape (3.5.1) was used to visualize the interaction network. The plugin Molecular Complex Detection (MCODE) was performed to screen the network, with the MCODE score >3. Pathway enrichment was analyzed for clusters; P<0.05 was considered as significant difference.

Differentially Expressed Genes.
Tissue derived organoids profoundly preserved the basic morphology and organization of primary tissues [7,8,10]. Moreover, tumor derived organoids profoundly revealed the genomic features of primary tumors [10]. Marc van de Wetering et al. compared the transcriptome profile of organoids with paired tumor tissues from nine patients. The gene expression profile of organoids displayed a high correlation (Pearson correlation 0.918 ± 0.040) with original paired biopsies, suggesting that organoids successfully recaptured the primary tumors on the scale of gene expressions [10]. Thus, it is grounded to analyze the organoids-based transcriptome profile. In this study, the total 37 organoid samples consisting with 22 tumors and 15 healthy samples were analyzed. Based on the GEO2R analysis, 853 gene sequences were identified. Only genes with GeneBank accession number according to National Center for Biotechnology Information (NCBI) database were listed as differentially expressed genes. Thus, a total of 405 genes were classified, 100 genes were upregulated in tumor organoid samples, and 305 genes were downregulated (Data not shown). The expression of top 50 upregulated and top 50 downregulated genes is listed (Figure 1).

Gene Ontology (GO) Enrichment
Analysis. GO analysis was performed with the DAVID web application. Biological processes (BP) essential for the tumorigenesis exhibited the extensive mRNA expression change in tumor organoid samples. We grouped cell cycle, cell proliferation, and growth to the capability of tumor for doubling its population; both upregulated and downregulated genes enriched in tumor population process (Table 1). In addition, in tumor samples gene expression alteration was found enriched in metastasis and angiogenesis which are characters of cancer [11] (Table 1). Upregulated genes also enriched in the process of biosynthesis. Processes which are important for the survival of cancer cells including cell death evasion, inflammation process, and immune system all exhibited the genes downregulation (Table 1). In addition, downregulated genes showed enrichment in extracellular matrix organization, homeostasis, and secretion process. For cell component (CC), hyperexpressed genes only enriched in the plasma membrane. However, genes those downregulated in tumor samples were found in various aspects of cells, which could be mainly grouped to extracellular part, cell-cell communication (cell junction), plasma membrane, and cytoplasm organelles ( Table 2). For molecular function (MF), overexpressed genes exhibited significant enrichments in nucleotide binding, including DNA binding and RNA polymerase activity (Table 3). Genes exhibiting the decrease in mRNA expression dramatically enriched in signaling molecular binding, including ion binding and ligand-receptor binding (Table 3).

KEGG Pathway Analysis.
In the work by Marc van de Wetering et al., it is presented that the mutation rates per Mb of tumor organoids exhibited the similarity of paired biopsies. Mutations in organoids were predominantly CpG to T transitions, consistent with original tumor tissues. Moreover, somatic variants within the coding regions in organoids were highly concordant with the corresponding biopsies for both hypermutated and nonhypermutated patients (median = 0.88 frequency of concordance, range 0.62-1.00). Furthermore, combine the analysis of somatic copy number alterations (SCNAs) and single nucleotide variants (SNVs) to infer Cancer Fractions (CCF) between tumor organoids and biopsies, revealing that CRC driver mutations were maintained in organoids and most commonly altered genes and were all represented in organoids, including APC, TP53, KRAS, PIK3CA, FBXW7, and SMAD4. These results suggested that there were no distinct CRC driver mutations in organoids which may lead to the variation in pathways. Next, in order to identify the pathway enrichment of DEGs in tumor organoid samples, we investigated the KEGG pathway enrichment of DEGs using DAVID (Table 4). Thus, alteration of gene expression can be found in well-defined CRC related pathway including WNT, MAPK, and TGF-pathways [1,2]. We also grouped SHH, ECM-receptor interaction, and FGF signaling pathways. Hyperexpressed genes were also found to regulate cell adhesion. Interestingly, three upregulated genes in colorectal cancer organoid samples were also associated with basal cell carcinoma. Repressed genes were found to be tightly associated with Rap1 signaling and regulating Pantothenate and CoA biosynthesis, cytoskeleton, and reninangiotensin system. In addition, downregulated genes were indicated to contribute the formation of bladder cancer in human.

Protein Interaction Network Analysis.
Protein-protein interactions (PPIs) are critical for the signaling transduction and the biological process. Thus, to investigate the protein interactions among the DEGs, we screened the protein interaction within the STRING database. Nodes with combined     score ≥ 0.4 (medium confidence) were subjected to Cytoscape to visualize the protein interaction ( Figure 2). Total nodes were subjected to analyze the interaction modules by using MCODE. Four modules were extracted from the network (Figures 3(a)∼3(d)). Pathway enrichment analysis of modules showed that proteins were mainly involved in extracellular matrix organization and the GPCR signaling pathway.

Discussion
Colorectal cancer development is a complex process of accumulation of genetic mutations, epigenetic alteration, and dysregulation of signaling pathways [1]. Despite numerous progress has been made in elucidating the molecular mechanism of the development of CRC, the underlying mechanism remains vagueness. Cancer cell lines have been applied as the main workforce for cancer research [3,6]. Nevertheless, like other tumors, CRC exhibits a high genomic instability which contributes to the various phenotypes and pathologies of CRCs [2]. Cancer cell lines limit in the capability of resembling the characteristics of the primary tumor tissue. Studies have reported the difference in gene expression and genomic alteration between cell line and tumors [4,5,12,13]. Resulting from the identification of Lgr5+ stem cell in intestine and colon [14], a new 3D in vitro system, organoid [7], has made great impact on the colorectal research [10,15,16]. As organoids resemble the characteristics of primary tissue, organoids have been widely applied for studying organ development, tissue homeostasis, tumorigenesis, and disease [17][18][19]. As a promising ex vivo 3D culture system for studying CRC, profiling genes expression level of CRC organoids is important for understanding the development of CRC. In the present study, we analyzed the data from GSE64392 and identified 100 upregulated genes and 305 downregulated genes in tumor organoids. DEGs were identified to involve in main capabilities of cancer, including cell renew, metastasis, angiogenesis, and cell death escaping. By analyzing the protein interaction of DEGs, we classified genes that may provide new insight for understanding the development of CRC.
In order to have a better view about the function of DEGs, GO analysis and KEEG pathway were performed. Upregulated genes were mainly involved in biological processes consisting of cell cycle, the cell proliferation controlling, tumor metastasis, and angiogenesis, which are essential for the sustentation of tumor [11]. For downregulated genes, in addition to involving the retaining of tumor, genes were also accumulated in biological processes which serves as barriers to cancer [11]. For example, programmed cell death, apoptosis, which is the most important physiologic processes in body to eliminate deteriorated cells [11], was another downregulated gene-enriched biological process. Moreover, downregulated genes were also found to participate in the following process, like extracellular matrix (ECM) organization, secretion process, immune-response, and homeostasis, alteration of which is acquired by cancer to escape the  ASB4   MMP1   FGF2   THBS1   EFNA5 TFAP2C   FBN2   CD44  HSPG2   ANGPTL4   CNNM1  TFF1  FADS2  RASEF  ZIC2  KLK10  ANO5   ZIC5  TFF3  KLK7  UBA7 NDNF STX19 CRABP2 MYOF Figure 2: The protein-protein interaction (PPI) network of differentially expressed genes (DEGs).
tethering of surrounding tissue and the chasing of immune system [11]. Organoid culture system has been proved to recapitulate the in vivo counterpart [20]; specifically, established CRC organoid cultures displayed a highly agreement with the primary tumor tissue, in genomic mutation, chromosomal alterations, and epigenetic modifications [1,2,10]. The KEGG pathway enrichment analysis revealed that genes in WNT pathway are upregulated, including AXIN2, DKK4, and NKD1, which are target of WNT/ -catenin signaling pathway and also antagonize WNT pathway. In addition, LEF1, which activates gene transcription in the axis of WNT signaling, was upregulated as well. The hyperexpression of WNT target genes corresponds with the hyperactivated WNT pathway in primary CRC and CRC cancer cell lines, most likely resulting from the biallelic inactivation mutation of APC, FBXW7, AXIN2, and FAM123B or the activating mutations in CTNNB1 [2,6]. Besides, overexpressed genes were identified to involve in other pathways, including FGF signaling, SHH signaling, and that are commonly found in various types of cancers [6,[21][22][23]. What is more, upregulated genes were related to cell-cell adhesion, alteration of which is widely reported in cancer [11,24]. For downregulated genes, besides carcinoma common pathways, altered genes were identified in Rap1 pathway, which regulates various cancer tightly related biological processes, including angiogenesis control, cell movements, intercellular interaction, and cell expansion [6,25]. Genes associated with CoA synthesis were also identified hypoexpressed, indicating the alteration of metabolism in CRCs [26,27]. Also, downregulated genes were classed in regulating cytoskeleton, this is correlated with the dysregulation of signaling transduction intracellularly. Additionally, downregulated genes were identified in reninangiotensin system which is an important pathway in regulating plasma sodium concentration and blood pressure, indicating that the development of CRCs may affect the homeostasis via the hormone system. Surveying alterations of these pathways may help us to understand the development of CRCs. Protein interactions are essential for the signaling transduction within cells and communicating intercellularly and environmentally. The protein interaction module analysis of DEGs revealed that, in tumor organoid samples, PPIs were identified and enriched in extracellular matrix organization, GPCR signaling pathway. GPCRs are the largest receptor in eukaryotes; they are coupled with G proteins triggering signaling cascades to regulate various physiological processes [28]. The alteration of GPCRs signaling pathway, in cancer cells, promotes the proliferation, angiogenesis, and metastasis, aids to escape from apoptosis, and sustains the survive. Moreover, PPIs were also identified to involve in other signaling pathways, including AMPK signaling, mTOR signaling, and DNA damage, which are frequently altered in CRCs [2].
In conclusion, our study shows a bioinformatic analysis of differentially expressed genes in CRCs organoids. The study supports the fact that organoid technology as a promising in vitro 3D system recapitulates the property of tumors. Our study will encourage the exploration of the application of organoid as a more convenient resource for CRC therapy studies.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Disclosure
This article does not contain any studies with human participants or animals performed by any of the authors.