Comprehensive Identification of Potential Crucial Genes and miRNA-mRNA Regulatory Networks in Papillary Thyroid Cancer

Background Thyroid cancer is the most common endocrine malignancy, with a recent global increase of 20% in age-related incidence. Ultrasonography and ultrasonography-guided fine-needle aspiration biopsy (FNAB) are the most widely used diagnostic tests for thyroid nodules; however, it is estimated that up to 25% of thyroid biopsies are cytologically inconclusive. Molecular markers can help guide patient-oriented and targeted treatment of thyroid nodules and thyroid cancer. Methods Datasets related to papillary thyroid cancer (PTC) or thyroid carcinoma (GSE129562, GSE3678, GSE54958, GSE138042, and GSE124653) were downloaded from the GEO database and analysed using the Limma package of R software. For functional enrichment analysis, the Kyoto Encyclopedia of Genes and Genomes pathway analysis and Gene Ontology were applied to differentially expressed genes (DEGs) using the Metascape website. A protein-protein interaction (PPI) network was built from the STRING database. Gene expression, protein expression, immunohistochemistry, and potential functional gene survival were analysed using the GEPIA website, the Human Protein Atlas website, and the UALCAN website. Potential target miRNAs were predicted using the miRDB and Starbase datasets. Results We found 219 upregulated and 310 downregulated DEGs, with a cut-off of p < 0.01 and ∣log FC | >1.5. The DEGs in papillary thyroid cancer were mainly enriched in extracellular structural organisation. At the intersection of the PPI network and Metascape MCODEs, the hub genes in common were identified as FN1, APOE, CLU, and SDC2. In the targeted regulation network of miRNA-mRNA, the hsa-miR-424-5p was found to synchronously modulate two hub genes. Survival analysis showed that patients with high expression of CLU and APOE had better prognosis. Conclusions CLU and APOE are involved in the molecular mechanism of papillary thyroid cancer. The hsa-miR-424-5p might have the potential to reverse the processes of papillary thyroid cancer by modulating the hub genes. These are potential targets for the treatment of patients with papillary thyroid cancer.


Introduction
Thyroid cancer is the most common malignant tumour of the endocrine system. Its incidence has risen sharply worldwide in the past few years, as age-standardised incidence rates have increased by 20%, from 2.74 to 3.3 per 100,000 [1]. In the United States, in 2020 alone, the estimated numbers of new cases and deaths from thyroid cancer are 52,890 and 2,180, respectively [2]. Intriguingly, a parallel increase in global mortality is difficult to explain in the context of earlier diagnosis and better treatment [3].
According to criteria defined by the National Cancer Institute, there are several histological types of thyroid cancer: papillary, follicular, poorly differentiated, and anaplastic. Differentiated forms of papillary and follicular thyroid cancer contribute to more than 90% of all thyroid carcinomas [4]. It is estimated that 5 to 70% of adults could be diagnosed with thyroid nodules, by various clinical tests [5,6], and the primary intention of their evaluation is to differentiate thyroid cancer from benign nodules. Most thyroid nodules can be determined by the commonly used diagnostic tests of ultrasonography and fine-needle aspiration biopsy (FNAB) [7]. However, it has been predicted that up to 25% of thyroid biopsies remain cytologically undefined, which can require diagnostic thyroid surgery [8].
Researchers have made much progress in discovering molecular mechanisms related to thyroid tumourigenesis, which can potentially be used as an adjunct in guiding clinical decisions. Somatic BRAF and RAS point mutations,   3 BioMed Research International as well as RET/PTC rearrangement, are the most recognised markers in this progression, involving the mitogenactivated protein kinase (MAPK) and PI3K/AKT signalling pathways [4]. This suggests the unprecedented possibility of more precise and effective approaches to the diagnosis and prognosis of thyroid cancer, based on the discovery of novel molecular markers.
This study used a high-throughput gene expression database to identify the potential molecular mechanisms of papillary thyroid cancer (PTC). In an attempt to elucidate the molecular mechanisms, lay a theoretical foundation, and provide well-defined therapeutic targets for the treatment of papillary thyroid cancer, we analysed differentially expressed genes (DEGs) and their regulatory relationships.

Materials and Methods
2.1. Microarray Datasets. The gene expression profiles of GSE129562, GSE3678, GSE54958, GSE138042, and GSE124653 were downloaded from the GEO database (https://www.ncbi.nlm.nih.gov/), and gene expression data for papillary thyroid cancer and normal thyroid tissue (partial paratumour normal thyroid tissue) were obtained. GSE129562, GSE3678, and GSE54958 were used for the identification of mRNA DEGs. GSE124653 and GSE138042 were used for the identification of differentially expressed miRNAs (DEMIs). The details and patient information of the datasets in the GEO are listed in Table 1. Our study procedure flowchart for searching papillary thyroid cancer target genes is shown in Figure 1.

Data
Analysis and DEG Acquisition. After preprocessing and standardisation of raw biological data, the original datasets were analysed using the Limma package of R software. The DEGs were further analysed by taking the |log FC | > 1:5 and p < 0:01 as thresholds. Volcano maps were drawn using R software. The intersection of the upregulated and downregulated genes was mapped using the Venn package (http://bioinformatics.psb.ugent.be/webtools/Venn/).

Functional Enrichment Analysis of DEGs.
Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses were enriched by Metascape [9], which provides a set of reliable, effective, and efficient tools to analyse and interpret bioinformatics studies. GO consists of three domains: molecular function (MF), cellular component (CC), and biological process (BP). DEGs were uploaded to Metascape for enrichment analysis. Terms with a p value < 0.01, a minimum count of 3, and an enrichment factor > 1:5 were collected and grouped into clusters based on their membership similarities.
2.4. Protein-Protein Interaction (PPI) Network of DEGs. The online database STRING 11.0 (http://string-db.org), a biological database and predictor of protein-protein interaction, was used to explore the PPI analysis of DEGs. An interaction with a combined score > 0:4 was considered statistically significant. Cytoscape (version 3.7.2) is a common source bioinformatics platform for the visualisation and analysis of molecular interaction networks. We also ran a PPI enrichment analysis using Metascape, with default parameters.
2.5. Hub Gene Confirmation and Analysis. The interaction coefficient between DEGs and hub genes was calculated using the plugin cytoHubba (version 0.1), and the hub genes were screened according to their degree.
Hub genes with ≥10 degrees were identified as high connectivity hub genes in the PPI network. In addition, the Metascape Molecular Complex Detection (MCODE) component was used to cluster a given network, based on topology, to find densely connected regions and identify the most densely connected networks. The cut-off criteria were the default values: degree cut-off = 2, node score cut-off = 0:2, Max depth = 100, and K-score = 2.  The GEPIA website (http://gepia .cancer-pku.cn/) can provide quick customisation, based on TCGA data. The expression of target genes was analysed through the GEPIA website, and the prognosis of target genes was validated using the Human Protein Atlas website (https://www. http://proteinatlas.org/), which contains immunohistochemistry (IHC) data, location, staining intensity, quantity, and patient information regarding the type of cancer. The results from the two websites were used to identify each target gene. p < 0:05 was considered statistically significant.

Hub
Gene-Related miRNA Prediction. MicroRNAs (miR-NAs) are small, endogenous RNA molecules consisting of 21-25 nucleotides, and their highly conserved regions can target gene expression by binding to their 3 ′ -untranslated regions (3′-UTRs). They play an important regulatory role in the aetiology of many animal and plant diseases and in pathophysiological and physiological functions. Each miRNA is supposed to be able to regulate multiple genes, via combinatorial and competitive interactions when bound to mRNA. To determine the potential interaction of miRNA-mRNA within the hub gene network, the online resources miRDB (http://mirdb.org/) and Starbase (http:// starbase.sysu.edu.cn/) were employed for miRNA target prediction, and Cytoscape 3.7.1 was used to construct the miRNA-mRNA regulatory network. GEO database for papillary thyroid carcinoma and paired, normal thyroid tissue. As shown in Figure 2, 13530, 20188, and 18837 DEGs from GSE129562, GSE3678, and GSE54958 were extracted, respectively (Figures 2(a)-2(c)). R cluster analysis software (|log FC | >1:5 and p value < 0.01 as the cut-off) found that, in diseased tissue compared with the paired normal thyroid tissue, there were 219 upregulated genes and 310 downregulated genes. The common DEGs in the three datasets were identified using Venn diagram software (Figures 2(d) and 2(e)). The results showed a total of 10 common DEGs, among which 6 were downregulated and 4 were upregulated.

DEGs, GO, and KEGG Pathway Analysis in Thyroid
Cancer. In an attempt to analyse the biological classification of DEGs, the Metascape website was used for functional and pathway enrichment analysis. BP enrichment showed that the increase in DEGs was mainly concentrated in extra-cellular structure organisation, myeloid leukocyte activation, blood vessel development, response to wounding, and regulation of cell adhesion (Figure 3(a)), while the decrease in DEGs was mainly concentrated in the detection of stimuli involved in sensory perception, detoxification, keratinisation, and oxygen transport (Figure 3(b)). GO analysis showed that the MF changes of upregulated DEGs were mainly concentrated in structural constituents of the extracellular matrix, glycosaminoglycan binding, proteoglycan binding, protease binding, and cell adhesion molecular binding (Figure 3(c)), while downregulated DEGs in MF were mainly concentrated in olfactory receptor activity and oxygen carrier activity (Figure 3(d)). Changes in the CC of upregulated DEGs were mainly enriched in the extracellular matrix, cytoplasmic vesicle lumen, specific granule, tertiary granule, and extracellular matrix components (Figure 3(e)), whereas downregulated DEGs in CC were mainly found in the following GO terms: keratin filament and collagen-containing extracellular matrix (Figure 3(f)).
KEGG pathway analysis showed that the DEGs were primarily concentrated in ECM-receptor interaction, complement and coagulation cascades, focal adhesion, cell adhesion molecules, (CAMs), transcriptional misregulation in cancer, pathways in cancer, the p53 signalling pathway, the TGFbeta signalling pathway, thyroid hormone synthesis, and the NF-kappaB signalling pathway (Figures 3(g) and 3(h)).
Additionally, Metascape online was applied to discover the hub clusters in the network with the MCODE component based on PPI enrichment analysis. In total, 15 modular MCODEs were extracted from the 529 DEGs, which included 38 hub genes and four seed genes: ADORA, HBB, EGR1, and GPR83 ( Figure 4 and Table S1). We intersected the 18 hub genes extracted by cytoHubba with 38 hub genes identified by Metascape and found four common hub genes: FN1, APOE, CLU, and SDC2.

The Biological Roles of the Target Genes in Tumours.
To study whether the target genes play a role in the oncogenesis of other tumours, we analysed their differential expression in normal tissues and in various carcinomas, using the GEPIA website. The analysis showed that CLU was upregulated in a variety of tumours, including lymphoid neoplasm diffuse  large B-cell lymphoma (DLBC), glioblastoma multiforme (GBM), kidney renal papillary cell carcinoma (KIRP), acute myeloid leukaemia (AML), brain lower grade glioma (LGG), ovarian serous cystadenocarcinoma (OV), thyroid carcinoma (THCA), and thymoma (THYM) (Figure 7(a)). This implies that, for some tumours, the underlying mechanisms of CLU in regulating their occurrence and development are identical. We confirmed that CLU was highly expressed in THCA, using the UALCAN website (http:// ualcan.path.uab.edu/) (Figure 7(b)). Differential expression of CLU was found to vary according to the cancer stage (Figure 7(c)), the patient's race (Figure 7(d)), the patient's age (Figure 7(e)), nodal metastatic status (Figure 7(f)), and histological subtype (Figure 7(g)). To investigate the potential molecular mechanisms of CLU, coexpressed genes were identified via the COXPRESdb website (http://coxpresdb .jp). This analysis showed that CLU-related genes were primarily concentrated in the proteoglycans in cancer and in glycine, serine and threonine metabolism, arginine and proline metabolism, histidine metabolism, and tyrosine metabolism ( Figure 8). We discovered that CLU had a strong coexpressive relationship with MIR6843, SCARA3, and MAOB ( Figure 8).
To investigate the potential molecular mechanisms of APOE, the genes coexpressed with APOE were identified via the COXPRESdb website (http://coxpresdb.jp), and analysis showed that APOE-related genes are primarily concentrated in cholesterol metabolism and the PPAR signalling pathway (Figure 10). We discovered that APOE has a strong coexpressive relationship with APOC1, APOC1P1, APOC2, HSD17B14, PLTP, and PAPLN ( Figure 10).
3.6. miRNA-mRNA Network Construction. The miRDB and Starbase databases were used for miRNA target prediction

Discussion
Thyroid nodules are extremely common and are often found in asymptomatic patients who are being evaluated for other conditions [6,10]. Thyroid nodules are commonly evaluated by tests involving thyroid function, ultrasound examination, and FNAB of selected nodules. However, approximately 25% 15 BioMed Research International of FNA cytology samples yield more than two types of indeterminate cytological diagnosis [11]. Moreover, the insufficient selection of thyroid nodules for biopsy can also lead to a missed diagnosis of thyroid cancer. The management of patients with indeterminate nodules is challenging, since the estimated risk of thyroid cancer is unpredictable (5-75%) [12][13][14].
Molecular cytology diagnosis has been applied to multiplatform detection of DNA, mRNA, and miRNA, which can help to identify inconclusive thyroid nodules and further improve the preoperative risk management of benign nodules with undetermined cytology [15]. Researchers have reviewed the top 12 recommended markers, which include those well-studied (MET, TFF3, SERPINA1, TIMP1, FN1, and TPO) as well as those that are relatively novel (TGFA, QPCT, CRABP1, and PROS1) [16].
To identify DEGs and target genes, we analysed the differences in gene profile between papillary thyroid cancer tissue and normal thyroid tissue. Molecular mechanisms and regulatory relationships with papillary thyroid cancer were identified, based on the high-throughput analysis of the gene expression database. Moreover, a theoretical foundation for the diagnosis of papillary thyroid cancer and accurate personal therapeutic targets are provided by our analysis. DEGs may be useful for GO analysis, and target genes can facilitate clinical studies, following consideration of their clinical relevance. In this study, 529 integrated DEGs were found in papillary thyroid cancer, using a comprehensive analysis of GEO (GSE129562, GSE3678, and GSE54958). GO (BP, CC, and MF) analysis was then performed on the 529 integrated DEGs. The DEG enrichment analysis yielded many terms concentrated in BP, CC, and MF. These results indicate that these DEGs are involved in the extracellular matrix of thyroid cancer cells. The KEGG pathway analysis showed that DEGs were primarily concentrated in ECM-receptor interaction, complement and coagulation cascades, cell adhesion molecules (CAMs), transcriptional misregulation in cancer, pathways in cancer, thyroid hormone synthesis, mineral absorption, and tyrosine metabolism. The ECM is highly dynamic since it is constantly deposited, reshaped, and degraded throughout development until maturity, to maintain tissue homoeostasis [17]. The composition and

19
BioMed Research International organisation of the ECM are spatiotemporally regulated to control cell behaviour and differentiation, and the dysregulation of ECM dynamics can lead to the development of diseases such as cancer [18]. Therefore, investigating this pathway will lead to a better understanding of the proliferation and invasion of papillary thyroid cancer and will help to predict tumour progression.
We built a PPI network with 529 integrated DEGs, and 18 hub genes were identified. Additionally, the MCODE component of Metascape online was also applied, and 38 hub genes were discovered. We then intersected the 18 hub genes and 38 hub genes identified by different methods; four hub genes (FN1, APOE, CLU, and SDC2) were found in common. They affect tumourigenesis and progression, mainly by affecting cell adhesion, migration, and apoptosis. These hub genes could be used as therapeutic targets in the management of papillary thyroid cancer and inconclusive thyroid nodules. Then, these 4 hub genes were subjected to a prognostic analysis using the GEPIA and Human Protein Atlas websites. Surprisingly, CLU and APOE expression showed strong relationships with the prognosis of patients with papillary thyroid cancer. Therefore, we chose CLU and APOE as target genes for further analysis.
The protein encoded by CLU is a secreted chaperone that may be involved in several basic biological events, such as cancer initiation and progression, and neurodegenerative disorders [19,20]. This involvement suggests that CLU might play an important role in cell death, cell cycle regulation, DNA repair, cell adhesion, tissue re-modelling, lipid trans-portation, membrane recycling, and immune system regulation [19,20]. Indeed, CLU is selectively overexpressed and strongly associated with increased tumourigenicity, metastatic potential, and resistance to chemotherapy [21][22][23][24]. It has been found that CLU mRNA and protein are overexpressed in several human cancers, including cancer of the prostate, breast, lung, kidney, ovary, colon, and endometrial tissues [25][26][27][28][29][30][31][32][33]. Overexpressions of CLU in DLBC, GBM, KIRP, LAML, LGG, OV, THCA, and THYM patients were found by using the GEPIA website, which was consistent with our main analysis. It has been reported that apoptotic triggers can upregulate CLU, which acts as a cell survival gene, and can affect cell resistance to apoptosis in carcinomas [34]. CLU was found within distinct, bipartite patches on the basolateral plasma membranes of cultured porcine thyrocytes, suggesting that it is a component of cell-adhesion complexes and is involved in cell-cell and cell-matrix interactions [35]. Our analysis showed that CLU is differentially expressed in patients according to cancer stage, race, age, nodal metastasis, and histological subtype. We also found that CLU has a strong coexpressive relationship with MIR6843, SCARA3, and MAOB.
APOE is an apolipoprotein associated with lipid particles, mainly mediating lipid transport between organs via the plasma and interstitial fluids [36][37][38]. In addition, APOE is involved in innate and adaptive immune responses, such as controlling the survival of myeloid-derived suppressor cells [39]. It has also been reported that APOE is related to a variety of biological cellular events, such as cell proliferation,

20
BioMed Research International migration, adhesion, and immunoregulation, in a tissuedependent manner [40]. Many studies have identified a link between APOE and carcinogenesis via metabolic mechanisms, including DNA synthesis, β-catenin localisation, cell proliferation, antioxidant function, angiogenesis, and metastasis [41,42]. Many reports have shown that APOE can be overexpressed in human malignancies, including cancers of the bladder, breast, and ovary, colorectal cancer, renal cell carcinoma, and gastric cancer [42][43][44][45][46][47]. It is consistent with these findings that the levels of APOE in DLBC, ESCA, GBM, HNSC, LGG, LIHC, PAAD, PRAD, SKCM, STAD, TGCT, THCA, THYM, UCEC, and UC patients were found to be overexpressed, using the GEPIA website. Although APOE mRNA is expressed in significant quantities in various normal human organs [40,48], it is almost undetectable in the thyroid and the expression of APOE protein is negative [49]. However, it has also been reported that APOE was highly expressed in THCA, via the UALCAN website. This suggests that high APOE expression may be an important characteristic of thyroid cancer. Our analysis showed that APOE is differentially expressed in THCA patients according to cancer stage, race, age, nodal metastasis, and histological subtype. We also found that APOE has a strong coexpressive relationship with APOC1, APOC1P1, APOC2, HSD17B14, PLTP, and PAPLN. It was found that both APOE and CLU could delay the initiation time of amyloid growth kinetics in a concentration-dependent manner, acting as extracellular

21
BioMed Research International chaperones to inhibit amyloid-β deposition in patients with sporadic cerebral amyloid angiopathy [50]. Additionally, studies have shown that concentrations of APOA1, APOE, and CLU were differentially expressed in cervical squamous cell carcinoma patients when compared to those with benign lesions and were associated with the histological classification or the processing of the cervical lesion [51]. Interestingly, we found that APOE correlated with CLU via LRP2 in MCODE2, and CLU and LRP2 had common predicted miR-NAs, which implicated hub genes and their potential miR-NAs that might affect thyroid cancer proliferation and invasion through the extracellular matrix. Figure 11: The construction of miRNA-mRNA network of hub genes in THCA. The red circles predicted the potential miRNAs that can be validated in GSE124653 and GSE138042. The green circular nodes represent mRNAs. The size of the nodes is equal to the target score of interaction between miRNAs and mRNA.