Identification of Functional Genes in Pterygium Based on Bioinformatics Analysis

Purpose The competing endogenous RNA (ceRNA) network regulatory has been investigated in the occurrence and development of many diseases. This research aimed at identifying the key RNAs of ceRNA network in pterygium and exploring the underlying molecular mechanism. Methods Differentially expressed long noncoding RNAs (lncRNAs), microRNAs (miRNAs), and mRNAs were obtained from the Gene Expression Omnibus (GEO) database and analyzed with the R programming language. LncRNA and miRNA expressions were extracted and pooled by the GEO database and compared with those in published literature. The lncRNA-miRNA-mRNA network was constructed of selected lncRNAs, miRNAs, and mRNAs. Metascape was used to perform Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses on mRNAs of the ceRNA network and to perform Protein-Protein Interaction (PPI) Network analysis on the String website to find candidate hub genes. The Comparative Toxicogenomic Database (CTD) was used to find hub genes closely related to pterygium. The differential expressions of hub genes were verified using the reverse transcription-real-time fluorescent quantitative PCR (RT-qPCR). Result There were 8 lncRNAs, 12 miRNAs, and 94 mRNAs filtered to construct the primary ceRNA network. A key lncRNA LIN00472 ranking the top 1 node degree was selected to reconstruct the LIN00472 network. The GO and KEGG pathway enrichment showed the mRNAs in ceRNA networks mainly involved in homophilic cell adhesion via plasma membrane adhesion molecules, developmental growth, regulation of neuron projection development, cell maturation, synapse assembly, central nervous system neuron differentiation, and PID FOXM1 PATHWAY. According to the Protein-Protein Interaction Network (PPI) analysis on mRNAs in LINC00472 network, 10 candidate hub genes were identified according to node degree ranking. Using the CTD database, we identified 8 hub genes closely related to pterygium; RT-qPCR verified 6 of them were highly expressed in pterygium. Conclusion Our research found LINC00472 might regulate 8 hub miRNAs (miR-29b-3p, miR-183-5p, miR-138-5p, miR-211-5p, miR-221-3p, miR-218-5p, miR-642a-5p, miR-5000-3p) and 6 hub genes (CDH2, MYC, CCNB1, RELN, ERBB4, RB1) in the ceRNA network through mainly PID FOXM1 PATHWAY and play an important role in the development of pterygium.


Introduction
Pterygium is a common ocular surface disease characterized by a triangular-shaped growth consisting of fibrotic subconjunctival connective tissue and hypertrophy of the overlying conjunctival epithelium [1]. The population affected by pterygium reached 200 million globally, and the prevalence in China was 108.65 million [2,3]. Pterygium has long been considered as a chronic degenerative condition; however, after abnormal expression of the p53 protein was found in the epithelium [4], pterygium is now considered ultravioletrelated uncontrolled cell proliferation, similar to tumors [5]. The development of pterygium is a complicated process involving cell proliferation, migration, inflammatory infiltrates, fibrosis, angiogenesis, and extracellular matrix breakdown [2]. However, the mechanism of pterygium formation and development is still not completely understood. Studies have demonstrated that noncoding RNAs served crucial roles in numerous diseases [6][7][8]. They are classified into two classes: small noncoding RNAs (micro-RNAs (miRNAs), small interfering RNAs, and transfer RNAs) and long noncoding RNAs (lncRNAs). Salmena et al. proposed the competing endogenous RNA (ceRNA) hypothesis, wherein lncRNAs harboring miRNA response elements competed with one another to bind to a common miRNA and thereby acted as molecular "sponges" and depressed the target genes of the miRNAs [9]. The ceRNA network has been demonstrated in numerous diseases, particularly in cancer. By establishing the lncRNA-miRNA-mRNA network, Wang et al. identified functional genes in heart failure and provided further insights into the important roles of the ceRNA network in heart failure [10]. Yang et al. discovered that lncRNA RP11-169F17.1 and RP11-669N7.2 could become novel biomarkers of stomach adenocarcinoma assessed by the construction of the ceRNA network [11]. Zhu et al. reported that 4 lncRNAs might act as potential therapeutic targets or candidate prognostic biomarkers in clear cell kidney carcinoma by reconstruction and comprehensive analysis of the ceRNA regulatory network [12].
However, there is only 1 research on constructing ceRNA networks for pterygium [13]. The hub genes of the network and their roles in the development of pterygium have not been fully understood. Therefore, we collected the lncRNA, miRNA, and mRNA datasets and published literature; filtered out differentially expressed long noncoding RNAs (DELs), microRNAs (DEMis), and mRNAs (DEMs) to constructed ceRNA networks; and identified hub genes closely related to pterygium, based on the functional and pathway analysis on mRNAs in the ceRNA network.

Data Collection. The Gene Expression Omnibus (GEO)
is an international, public functional genomics data repository for high-throughput microarray and next-generation sequences [14]. The published pterygium gene expression profiles (GSE83627, GSE21346, GSE51995, and GSE2513) were downloaded [15][16][17][18][19]. The GSE83627 lncRNA data set was collected using GPL14550 platforms (Agilent-028004 SurePrint G3 Human GE 8 × 60 K Microarray, Agilent Technologies, Inc.) and included samples from 4 pterygium and 4 healthy conjunctiva tissues. The miRNA expression data of GSE21346 were based on the GPL7723 platforms (miRCUR-YLNA microRNA Array, v.11.0-hsa, mmu & rno, Exiqon A/S) and consisted of samples from 3 pterygium and 3 healthy conjunctiva tissues. The mRNA expression data of GSE51995 were from 4 pterygium samples and 4 healthy conjunctiva tissues and were based on GPL14550 platforms (Agilent-028004 SurePrint G3 Human GE 8 × 60 K Microar-ray, Agilent Technologies, Inc.). Another mRNA expression data of GSE2513 were from 8 pterygium samples and 4 healthy conjunctiva tissues and were based on GPL96 platforms (Affymetrix Human Genome U133A Array, Affymetrix). In order to make the datasets more abundant, we executed a systemic search in PubMed, CNKI, and Web of Science and found 2 microarray studies on DELs and 7 studies on DEMis, including 4 microarray assays studies and 3 experimental verification studies.

Data
Preprocessing. All downloaded data sets were standardized through the limma package of R 3.6.3, and the standardized expression matrix was used for the differential expression analysis. The DELs, DEMis, and DEMs between pterygium samples and healthy control conjunctiva tissues were selected according to |log 2 fold change ðFCÞ | ≥1 and P values <0.05 [20]. Sequentially, DELs and DEMis were pooled based on the microarray data in the searched studies, respectively, with the Draw Venn Diagram website (http:// bioinformatics.psb.ugent.be/webtools/Venn/). And the pooled results were used as candidate DELs and DEMis for the following analysis. The volcano plots and the heatmaps of DELs, DEMis, and DEMs expressions were set up using the ggpolt 2 package and pheatmap package in R 3.6.3 separately.

Construction of the lncRNA-miRNA-mRNA Network.
Based on the associations between lncRNAs-miRNAs and miRNAs-mRNAs, the ceRNA network was constructed. First, the upregulated and downregulated RNAs (lncRNAs, miRNAs, and mRNAs) were assigned |log 2FC | >1 with P values <0.05; then, the lncRNA-miRNA-mRNA network was visualized by using the Cytoscape 3.7.2 software [23], and all node degrees of the ceRNA network were calculated using the CytoHubba plugin [24] of Cytoscape 3.7.2. The significant lncRNA was selected according to the node degree ranking, and a new ceRNA network was reconstructed centered on the top lncRNA.

Functional Analysis on mRNAs in ceRNA Network for
Discovering Hub Genes. To analyze the function of mRNAs in ceRNA networks, GO and KEGG analyses were performed by Metascape (http://metascape.org/) [25]. The proteinprotein interaction (PPI) network of mRNAs in the ceRNA network was constructed by the STRING database with the 2 BioMed Research International confidence score > 0:4 as the cutoff level, providing the critical assessment and integration of protein-protein interactions of genes, including direct (physical) as well as indirect (function) associations [26,27]. Then, we used the Cyto-Hubba plugin of Cytoscape 3.7.2 to analyze the PPI network and to choose the candidate hub genes with top node degrees. Sequentially, we identified hub genes closely related to pterygium using the Comparative Toxicogenomic Database (CTD), which provides interactions between diseases-genes and diseases-drugs.

BioMed Research International
3.2. miRNA Predicted lncRNA and mRNA Targets. Firstly, the associations between the DELs and DEMis were assessed to reveal the 12 miRNAs targeting 8 lncRNAs. Subsequently, 94 mRNAs targeted by 12 miRNAs were identified. Finally, the 8 coexpressed lncRNAs, 12 coexpressed miRNAs, and 94 coexpressed mRNAs were selected to construct the primary ceRNA network (Table 1).

Functional Analysis of mRNAs Related to Pterygium.
The 94 mRNAs in the primary ceRNA network was investigated by the GO and KEGG pathway enrichment analysis with Metascape. Metascape results were dominated by functional categories, including regulation of homophilic cell adhesion via plasma membrane adhesion molecules, developmental growth, regulation of neuron projection development, adherens junction interactions, cell maturation, synapse assembly, central nervous system neuron differentiation, positive regulation of mRNA metabolic process, negative regulation of cell-substrate adhesion, tissue morphogenesis, and blood vessel morphogenesis. Two pathways were enriched, PID FOXM1 PATHWAY and PID ATF2 PATHWAY. The occurrence of pterygium is closely related to the epithelialmesenchymal transitions (EMT) and angiogenesis under the tissue [47]. FOXM1 is a member of the forkhead box (FOX) transcription factor family. Previous reports have demonstrated that FOXM1 is oncogenic, in breast cancer [48], hepatocellular carcinoma [49], prostate cancer [50], lung cancer [51], and colorectal cancer [52], and plays important roles in angiogenesis, invasion, and metastasis [53]. It may indicate the mRNAs in the ceRNA network are the biogenesis of the transformation of conjunctival cells into fibroblasts and the formation of subconjunctival neovascularization according to these biological processes and pathways, especially by regulating cell maturation, tissue morphogenesis, blood vessel morphogenesis, and PID FOXM1 PATHWAY (Figure 3(c)).
In order to explore the role of LINC00472 in pterygium, we performed GO and KEGG enrichment analysis again on the 84 mRNAs in the LINC00472 ceRNA network. We found 6 BioMed Research International that these genes were enriched in the same biology as the primary ceRNA network, but also in the cell part morphogenesis, regulation of synaptic transmission, glutamatergic, cellular response to lipid, negative regulation of cell differentiation, nervous system development, temperature homeostasis, NABA CORE MATRISOME, regulation of behavior, and response to growth factor. In addition, the mRNAs in both networks are enriched in PID FOXM1 PATHWAY, indicating that LINC00472 is likely to regulate the growth of pterygium through PID FOXM1 PATH-WAY (Figure 4(b)).

PPI Network Construction and Analysis of Hub Genes.
The PPI network of mRNAs in the LINC00472 network 7 BioMed Research International was obtained by using the STRING database, including 36 nodes and 48 edges, and the nodes contain 24 downregulated genes and 12 upregulated genes ( Figure 5(a)). The top 10 genes evaluated by connectivity degree in the PPI network were present in CytoHubba, and the result showed that CDH2 was the most significant genes with a connectivity degree of 10, followed by MYC, CCNB1, PCDHA10, RELN, CCNA2, ERBB4, CDH11, PCDHA, and RB1 ( Figure 5(b)). We defined the 10 genes as candidate hub genes in the PPI network. It means LINC00472 may regulate pterygium through these 10 key genes.
3.6. Identified Hub Genes Closely Related to Pterygium. We searched for genes related to pterygium in the CTD and found 6163 mRNAs, these genes are associated with pterygium or its descendants. A gene has either a curated association to the disease (M marker/mechanism and/or T therapeutic) or an inferred association via a curated chemical interaction. Then, we found 8 overlapped genes in the database, CCNB1, CDH11, MYC, CCNA2, ERBB4, RELN, RB1, and CDH2 ( Figure 6), as selected hub genes. It implied that the 8 genes in the LINC00472 network may play a key role in pterygium.

Hub Genes Were Verified.
Through RT-qPCR, we used 20 groups of pterygium and control conjunctival tissues to verify the above 8 hub genes. There are 6 genes that are highly expressed in pterygium, MYC, CDH2, CCNB1, RELN, RB1, and ERBB4. Compared with the control conjunctiva, only the expression of CDH11 and CCNA2 showed no significant differences with pterygium ( Figure 7). The results indicate that the prediction of the hub genes is reasonably reliable and could be used to discover more hub genes in pterygium.

Discussion
Pterygium is a common disease in ophthalmology, and its occurrence is related to the proliferation of subconjunctival fibrous tissue and the formation of new blood vessels [15]. The treatment of pterygium is mainly surgical treatment, but the high postoperative recurrence rate is still a problem that is currently difficult to solve [54]. Therefore, effective drug therapy combined with the operation to reduce the recurrence rate of pterygium has become a research hot spot. Recently, with the development of high-throughput genomic platforms and bioinformatic analysis, more and more noncoding RNAs, especially miR-NAs and lncRNAs, have been discovered. They serve an important role in the regulation of multiple biological processes, including the development of pterygium [55,56].
In this study, we mined pterygium-associated RNA expression datasets from the GEO to construct the ceRNA network. Through the analysis of the 4 data sets and 9 publications about lncRNAs and miRNAs, we obtained 251 DELs, 46 DEMis, and 897 DEMs. By the link of miRNA, the primary ceRNA network of pterygium was constructed with 8 lncRNA nodes, 12 miRNA nodes, 94 mRNA nodes, and 136 edges. Since LINC00472 has the highest node degree value of all lncRNAs in the ceRNA network (Figure 3(b)), we took LINC00472 as the center and built the LINC00472 network. And we found genes in the primary ceRNA network mainly enriched in homophilic cell adhesion via plasma membrane adhesion molecules, developmental  BioMed Research International growth, regulation of neuron projection development, adherens junctions interactions, cell maturation, synapse assembly, central nervous system neuron differentiation, positive regulation of mRNA metabolic process, negative regulation of cell-substrate adhesion, tissue morphogenesis, blood vessel morphogenesis, and PID FOXM1 PATHWAY, while genes in the LINC00472 network enriched in homophilic cell adhesion via plasma membrane adhesion molecules, developmental growth, cell part morphogenesis, regulation of synaptic transmission, glutamatergic, cellular response to lipid, negative regulation of cell differentiation, nervous system development, and temperature homeostasis. In addition, the genes in both networks enriched in PID FOXM1 PAYH-WAY, which represents cell proliferation [57], angiogenesis, invasion, and metastasis [53]. This indicated that LINC00472 might play an important role in pterygium through FOXM1 PATHWAY. After the PPI network analysis on mRNAs in the LINC00472 network was visualized by Cytoscape 3.7.2, 10 candidate hub genes were found out. We searched the 10 hub genes in the CTD; 8 hub genes were identified to be involved in pterygium. They are CCNB1, CDH11, MYC, CCNA2, ERBB4, RELN, RB1, and CDH2, and 6 of them were confirmed highly expressed in pterygium by wet experiments (CCNB1, MYC, ERBB4, RELN, RB1, and CDH2). We presume that the LINC00472 network might function in pterygium via the FOXM1 PATHWAY, especially through the regulation of CCNB1, MYC, ERBB4, RELN, RB1, and CDH2 in the ceRNA network. And the data mining based on the public data and publications in the present study is useful and reliable for digging emerging therapy targets of pterygium.

Conclusion
The present study increased the understanding of the ceRNA-associated regulatory mechanism in the pterygium and identified LINC00472 as a key lncRNA in the whole ceRNA network, which might be highly related to pterygium via the FOXM1 PATHWAY in the LINC00472 network, especially by regulating its related 6 hub genes (CCNB1, MYC, ERBB4, RELN, RB1, and CDH2).