Dysregulation of Pseudogenes/lncRNA-Hsa-miR-1-3p-PAICS Pathway Promotes the Development of NSCLC
Journal of Oncology

Objective. Non-small cell lung cancer (NSCLC) accounts for about 80% of all lung cancers, and its 5-year survival rate is poor because about 68% of patients are already at an advanced stage when first diagnosed. The molecular mechanisms of NSCLC are still being explored. Methods. The GSE18842 and GSE19804 datasets were used to screen for differentially expressed genes (DEGs) in NSCLC, and GEPIA was used to validate DEG expression. Prognostic values were determined through Kaplan-Meier analysis. Three target prediction databases were used to predict potential microRNAs (miRNAs), while miRNet predicted the long non-coding RNAs (lncRNAs) and pseudogenes upstream of hsa-miR-1-3p. UALCAN was used to identify the genes co-expressed with PAICS, and enrichment analysis of these genes was performed with Enrichr. Results. We initially found that the expression of cyclin B1 (CCNB1), cyclin-dependent kinase 1 (CDK1), and phosphoribosylaminoimidazole succinocarboxamide synthetase (PAICS) was notably increased in NSCLC. We predicted 6, 10, and 7 miRNAs targeting CCNB1, CDK1, and PAICS, respectively. Among the miRNA-mRNA (microRNA-messenger RNA) pairs, we deduced that the hsa-miR-1-3p-PAICS axis had the greatest potential to inhibit the occurrence of NSCLC. Mechanistically, the hsa-miR-1-3p-PAICS axis appeared to participate in regulating mitosis. Moreover, we identified 5 pseudogenes and 33 lncRNAs that might inhibit the hsa-miR-1-3p-PAICS axis in NSCLC. Conclusions. On the basis of this study, the pseudogene/lncRNA-hsa-miR-1-3p-PAICS pathway appears to be important in NSCLC and may provide candidate therapeutic targets and promising biomarkers for the diagnosis of NSCLC.


Introduction
Among malignant tumors, NSCLC is especially common, accounting for about 80% of lung cancer cases in China. NSCLC comprises several subtypes, such as squamous cell carcinoma and adenocarcinoma. About 68% of patients are reported to be diagnosed at an advanced stage, with a low five-year survival rate [1]. Surgery [2], radiation [3], chemotherapy [4], biotherapy [5], immunotherapy [6], and electric field therapy [7] are the current treatment options for NSCLC. However, the therapeutic outcome remains poor even when several treatments are combined. The primary reason is that the pathophysiology of the disease and its prognostic markers remain unclear.
Long non-coding RNA (lncRNA) is a class of non-coding RNA longer than 200 nucleotides. Numerous studies have established that lncRNAs play a noteworthy role in many biological processes, including dosage compensation [8], epigenetic control [9], cell cycle regulation [10], and regulation of cell differentiation [11], and they have become a focus of genetic research. Generally, lncRNA transcripts can influence the activity of particular proteins by binding to them. Competing endogenous RNAs (ceRNAs) regulate other RNA transcripts by competing for shared miRNAs. A non-coding pseudogene can bind to and compete for the same set of miRNAs via microRNA response elements (MREs), which serve as binding sites [12], thereby affecting the distribution of miRNA molecules across all their targets. Pseudogenes are a convincing example of ceRNAs, as they presumably contain many of the same MREs as their ancestral genes and can bind the miRNAs that target those genes [13]. Furthermore, ceRNAs may suppress the activity of some miRNAs [14], and the reduced activity of these miRNAs may lead to overexpression of particular genes associated with NSCLC. Through a series of analyses, we constructed a regulatory network associated with the progression of NSCLC in this study. We expect this work to offer new perspectives on the pathogenesis and treatment of NSCLC. Figure 1 summarizes the research workflow.

The Analysis of Microarray Data and Screening for Differentially Expressed Genes.
To compare genome-wide gene expression between NSCLC and normal tissues, we searched the widely used GEO database (https://www.ncbi.nlm.nih.gov/geo/) [15]. For subsequent analyses, the GSE18842 dataset (46 tumor and 45 normal samples) and the GSE19804 dataset (60 tumor and 60 normal samples) were used. We filtered the DEGs in the GSE18842 and GSE19804 microarray data separately, using the R package limma [16] with the criteria p < 0.05 and log2FC > 2. We then intersected the two result sets to obtain the common DEGs and displayed them in a Venn diagram.
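The filtering and intersection step can be illustrated with a minimal Python sketch. It is not the limma workflow itself: the per-gene statistics are assumed to have been computed already, the function names and toy values are hypothetical, and the fold-change threshold is applied to the absolute value on the assumption that down-regulated DEGs (reported later in the text) were selected the same way.

```python
from typing import Dict, Set, Tuple

# Hypothetical per-dataset results: gene symbol -> (log2 fold change, p value).
Result = Dict[str, Tuple[float, float]]


def filter_degs(results: Result, p_cut: float = 0.05, lfc_cut: float = 2.0) -> Set[str]:
    """Keep genes with p < p_cut and |log2FC| > lfc_cut.

    Using the absolute value is an assumption made for this sketch.
    """
    return {
        gene
        for gene, (log2fc, p_value) in results.items()
        if p_value < p_cut and abs(log2fc) > lfc_cut
    }


def common_degs(first: Result, second: Result) -> Set[str]:
    """Intersect the DEG sets of two datasets (the overlap of a Venn diagram)."""
    return filter_degs(first) & filter_degs(second)


# Toy values for illustration only; they are not taken from GSE18842 or GSE19804.
dataset_a = {"GENE_A": (2.4, 0.001), "GENE_B": (3.1, 0.0005), "GENE_C": (0.8, 0.01)}
dataset_b = {"GENE_A": (2.2, 0.003), "GENE_B": (1.5, 0.002), "GENE_D": (-2.6, 0.04)}

print(sorted(common_degs(dataset_a, dataset_b)))
```

Only genes passing both thresholds in both datasets survive the intersection, which is why the common DEG list is shorter than either per-dataset list.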

The Analysis of Functional Enrichment, Interaction Network, and Identification of Hub Genes.
To further elucidate the potential functional annotation and pathway enrichment of the DEGs [17], Gene Ontology (GO) analysis was conducted using the clusterProfiler package (version 3.18.0) [18], with p < 0.05 considered statistically significant. The protein-protein interaction (PPI) network of the DEGs was built with STRING (version 11.0) [19], with a threshold score of 0.4. We removed protein nodes that had no interaction with other proteins. The PPI network was then examined with Cytoscape (version 3.8.0) to identify key modules and hub genes [20]. We used the MCODE plugin (version 2.0.0) to identify important clustering modules based on the following criteria: MCODE score >10 and node count >20. Pathway enrichment analysis of the genes in these modules was conducted with the clusterProfiler package [21]. Following that, we used the CytoHubba plugin (version 0.1) to analyze the PPI network and defined genes with a degree greater than 30 as NSCLC hub genes [22].
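The degree-based hub selection described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the CytoHubba implementation: the edge list format and the function name are hypothetical, and an edge is kept when its score is at least 0.4.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical edge format: (protein A, protein B, combined interaction score in [0, 1]).
Edge = Tuple[str, str, float]


def hub_genes(edges: Iterable[Edge], min_score: float = 0.4, min_degree: int = 30) -> List[str]:
    """Return genes whose degree exceeds min_degree in the score-filtered network.

    Proteins without any retained edge never enter the counter, which mirrors
    the removal of unconnected nodes.
    """
    degree: Counter = Counter()
    seen = set()
    for a, b, score in edges:
        if a == b or score < min_score:
            continue  # skip self-loops and low-confidence interactions
        pair = frozenset((a, b))
        if pair in seen:
            continue  # count each undirected edge once
        seen.add(pair)
        degree[a] += 1
        degree[b] += 1
    return sorted(gene for gene, d in degree.items() if d > min_degree)


# Toy network: one node with 31 confident partners, one with only weak edges.
toy_edges = [("HUB", f"N{i}", 0.9) for i in range(31)]
toy_edges += [("WEAK", f"N{i}", 0.2) for i in range(40)]

print(hub_genes(toy_edges))
```

Degree is the simplest centrality measure; it rewards proteins with many confident partners but ignores how those partners are connected to each other.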

The Survival Analysis and Validation.
To assess the prognostic relevance of the hub genes in NSCLC, we used the R package survival (version 3.2-7) for survival analyses, with the default settings and the median as the cut-off value [23]. NSCLC samples were selected as the dataset. The Cox proportional hazards and Kaplan-Meier models were used to compute the hazard ratio (HR); p < 0.05 was considered statistically significant. The GEPIA database (http://gepia.cancer-pku.cn/detail.php) [24], which is used to examine RNA sequencing expression data, contains data from 483 tumor and 347 normal samples from the TCGA and GTEx projects. To assess the survival impact of the above genes in this database, we set Group Cutoff to Quartile, Cutoff-High to 75%, Cutoff-Low to 25%, and 95% Confidence Interval to No. For the differential expression analysis between tumor and normal samples, all parameters were left at their default values; p < 0.05 was considered statistically significant.
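As an illustration of the median cut-off and the Kaplan-Meier estimate, the following Python sketch implements both from first principles. It is not the R survival package: the function names are hypothetical, samples exactly at the median are assigned to the low group by assumption, and the Cox model and HR are left out.

```python
from statistics import median
from typing import List, Sequence, Tuple


def split_by_median(expression: Sequence[float]) -> List[str]:
    """Label each sample 'high' or 'low' relative to the median expression.

    Assigning ties at the median to 'low' is an assumption of this sketch.
    """
    cut = median(expression)
    return ["high" if value > cut else "low" for value in expression]


def kaplan_meier(times: Sequence[float], events: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (time, survival probability) at each distinct event time.

    events[i] is 1 if the event (death) was observed and 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather every sample that leaves the risk set at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy follow-up data for illustration only.
print(split_by_median([1.0, 2.0, 3.0, 4.0]))
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

In practice one curve is estimated per expression group, and the two curves are then compared with a log-rank test.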

Screening for Key miRNAs.
We used Cytoscape to generate three target gene networks in this article [20]. Following that, the prognostic value of the expression of the miRNAs targeting the hub genes in NSCLC was determined with the Kaplan-Meier plotter (https://kmplot.com/) [35], a web-based database for gene expression. This database contains data on lung cancer [36], ovarian cancer [37], gastric cancer [38], and breast cancer [39]. In brief, the miRNAs were first entered as input. Based on the median expression value, all NSCLC cases were divided into a low-expression group and a high-expression group. Then we generated Kaplan-Meier survival plots on this web page. The HR, 95% CI, and log-rank p value were computed and displayed automatically on the web page. A p value <0.05 was considered statistically significant.

ENCORI Database Analysis.
The ENCORI database (https://starbase.sysu.edu.cn) is a free platform for studying the interactions of non-coding RNAs [40,41]. We used ENCORI to assess the correlation between miRNA expression and gene or pseudogene expression; R < -0.1 and p < 0.05 were set as the cut-off values for correlated miRNA-gene/pseudogene pairs. Additionally, the ENCORI database was used to predict pseudogenes and lncRNAs that potentially bind to hsa-miR-1-3p.
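The correlation cut-off can be made concrete with a short Python sketch. It does not reproduce ENCORI's computation: the function names are hypothetical, and the p value is estimated here with a permutation test, a technique swapped in for this illustration because it needs only the standard library.

```python
import math
import random
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient; both inputs must have non-zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_p(x: Sequence[float], y: Sequence[float], n_perm: int = 2000, seed: int = 0) -> float:
    """Two-sided permutation p value for the correlation (an assumption of this sketch)."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


def is_negative_pair(x: Sequence[float], y: Sequence[float], r_cut: float = -0.1, p_cut: float = 0.05) -> bool:
    """Apply the cut-offs R < r_cut and p < p_cut to one miRNA-gene pair."""
    return pearson_r(x, y) < r_cut and permutation_p(x, y) < p_cut


# Toy expression vectors for illustration only.
mirna = [float(v) for v in range(10)]
gene = [10.0 - v for v in mirna]
print(is_negative_pair(mirna, gene))
```

A pair passes only when the correlation is both negative beyond the cut-off and unlikely under random pairing of samples, which matches the expectation that a miRNA suppresses its target.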

UALCAN Database Analysis.
UALCAN (http://ualcan.path.uab.edu) is a database for evaluating differentially expressed genes and their survival effects that provides easy access to publicly available cancer transcriptome data, including NSCLC [42,43]. In this study, the database was used to identify genes co-expressed with PAICS in NSCLC. Then, as noted previously, these co-expressed genes were cross-referenced with those obtained from the GEPIA database. Genes that appeared in both databases were retained as the common co-expressed genes and selected for further enrichment analysis. Five enriched GO terms and KEGG pathways were displayed and downloaded as images.

Functional Enrichment Analysis of DEGs, Integration of the Protein-Protein Interaction Network, and Modular Analysis.
To gain a better understanding of the biological roles of the DEGs, GO enrichment analysis was performed. In the BP category, upregulated DEGs were enriched in organelle fission, nuclear division, and mitotic nuclear division (Figure 3(a)). By contrast, downregulated DEGs were enriched in regulation of vasculature development, regulation of angiogenesis, and cell-substrate adhesion (Figure 3(b)). In the CC category, upregulated DEGs were mostly localized to the spindle and condensed chromosome (Figure 3(a)), whereas downregulated DEGs were commonly located in the collagen-containing extracellular matrix and cell-cell junction (Figure 3(b)). In the MF category, upregulated genes were primarily enriched in extracellular matrix structural constituent and metalloendopeptidase activity, whereas downregulated DEGs were mostly enriched in actin binding, extracellular matrix structural constituent, and cytokine binding (Figure 3(b)). Furthermore, KEGG pathway analysis showed that upregulated DEGs were enriched in the cell cycle, oocyte meiosis, and ECM-receptor interaction pathways (Figure 3(c)). In comparison, downregulated DEGs were more frequent in cytokine-cytokine receptor interaction and cell adhesion molecules (Figure 3(d)).
Next, we constructed the PPI network with STRING and evaluated it in Cytoscape. Using the MCODE plugin, we obtained three major clustering modules and examined the functional annotation of these modules (Figure 4). The first cluster module contains 63 nodes and 1777 edges. Module 1 genes are primarily involved in progesterone-mediated oocyte maturation (Figures 4(a), 4(d)). Module 2 consists of 43 nodes and 427 edges and contains genes mainly involved in malaria and the interleukin 17 (IL-17) pathway (Figures 4(b), 4(e)). The third cluster module contains 55 nodes and 199 edges. In this module, the genes are mainly related to extracellular matrix-receptor (ECM-receptor) interaction, transcriptional dysregulation in cancer, and protein digestion and absorption (Figures 4(c), 4(f)). In total, 125 DEGs were identified as hub genes using the degree method in the CytoHubba plugin for further research.

Survival Analysis and Validation.
The prognostic significance of the 125 hub genes was assessed with the R survival package. Survival analysis showed that most genes were not related to overall survival (OS) in NSCLC patients. However, Cox proportional hazards analysis suggested that EZH2, CCNB1, MMP9, SOX2, FCGR3B, IL6, COL1A1, PAICS, and CDK1 were substantially associated with OS in NSCLC patients (Table 1). Three genes (CCNB1, CDK1, and PAICS) had a significant effect on patients' OS, and their expression differed significantly between the tumor and normal groups (Figures 5(a)-5(f)). Overall, CCNB1, CDK1, and PAICS may be three critical genes that influence the stage progression of NSCLC and are associated with a poor prognosis.

The hsa-miR-1-3p-PAICS Axis Is Identified as a Potential Pathway Linked to the Progression of NSCLC.
MiRNAs chiefly function as negative regulators of gene expression and are important in human biological processes, including cancer initiation and progression. We therefore used eight prediction programs to determine the upstream miRNAs of CCNB1, CDK1, and PAICS (Table 2). We identified 6, 10, and 7 upstream miRNAs that may target CCNB1, CDK1, and PAICS, respectively. To facilitate visualization, miRNA-CCNB1, miRNA-CDK1, and miRNA-PAICS subnetworks were constructed, as seen in Figures 6(a)-6(c). The prognostic significance of these miRNAs in NSCLC was then determined using the TCGA database. As seen in Figure 6(d), among all predicted CCNB1 miRNAs, high expression of hsa-miR-548b-5p was associated with favorable OS in NSCLC patients, whereas high expression of hsa-miR-3130-5p was associated with poor OS. For CDK1, higher expression of hsa-miR-6501-3p, hsa-miR-188-3p, and hsa-miR-186-3p was each associated with a favorable prognosis (Figure 6(e)). For PAICS, upregulation of hsa-miR-374a-5p and of hsa-miR-1-3p each corresponded to a favorable prognosis. Given the mechanism of miRNA action and the oncogenic potential of CCNB1, CDK1, and PAICS, the upstream miRNAs of these three genes ought to be tumor suppressive. We therefore chose hsa-miR-548b-5p, hsa-miR-186-3p, hsa-miR-6501-3p, hsa-miR-188-3p, hsa-miR-374a-5p, hsa-let-7c-3p, hsa-miR-374b-5p, and hsa-miR-1-3p for further analysis of miRNA-mRNA pair expression relationships. Only hsa-miR-1-3p showed a strong negative correlation with its target, PAICS, in NSCLC, as seen in Figures 6(g)-6(n). We also used the GEPIA database to assess the differential expression of hsa-miR-1-3p and found that its expression was notably decreased in patients compared with normal controls, suggesting that this miRNA is clinically relevant (Figure 6(o)).
In summary, the hsa-miR-1-3p-PAICS axis is the most plausible pathway mediating the stage progression of NSCLC.

The hsa-miR-1-3p-PAICS Axis Is Related to the Regulation of Mitosis as Revealed by Co-Expression and Enrichment Analyses.
Two databases were used for co-expression analysis: UALCAN and GEPIA. We obtained 1898 and 200 (the top 200 most influential genes) co-expressed genes from the two databases, respectively; they are listed in Supplementary Table S2. We found that 185 PAICS co-expressed genes were common to both databases (Figure 7(a), Table 3). These genes were subjected to GO functional annotation and KEGG pathway enrichment analysis using Enrichr. Mitotic sister chromatid segregation and organelle fission were enriched in the BP class (Figure 7(b)). The CC class encompassed chromosome and centromeric regions (Figure 7(c)), whereas the MF class included motor activity and chemokine activity (Figure 7(e)). The enriched KEGG pathways mainly pointed to the PPAR signaling pathway (Figure 7(d)). These results indicate that, by acting on the chromosome and centromeric region, the hsa-miR-1-3p-PAICS axis may be implicated in mitotic sister chromatid segregation, the PPAR signaling pathway, and motor activity, thus limiting the development of NSCLC.

Potential Upstream Pseudogenes and lncRNAs of the hsa-miR-1-3p-PAICS Axis.
Pseudogenes and lncRNAs are both significant subtypes of non-coding RNAs, and one of their main functions is to act as competing endogenous RNAs that interact with mRNAs by competing for common miRNAs. We therefore used the ENCORI database to predict potential pseudogenes upstream of the hsa-miR-1-3p-PAICS axis. As shown in Figure 8(a), 119 pseudogenes were identified. According to the ceRNA mechanism, these pseudogenes should act as oncogenes in NSCLC. We used GEPIA to determine the expression levels of the 119 pseudogenes. Only five pseudogenes were substantially elevated in cancer samples compared with normal controls: FAM91A3P (Figure 8(b)), LRRC37A6P (Figure 8(c)), PKMP1, RPL9P32, and BMS1P8. Additionally, we predicted lncRNAs that might regulate hsa-miR-1-3p (Figure 9, Supplementary Table S3). As shown in Figures 9(a)-9(c), 82, 44, and 92 upstream lncRNAs were found in lncACTdb, miRNet, and ENCORI, respectively; the lncRNAs are detailed in Supplementary Table S3. Intersecting the three databases yielded 33 lncRNAs (Figure 9(d)). In summary, overexpression of lncRNAs/pseudogenes results in enhanced PAICS expression and altered mitosis regulation, which contributes to the development of NSCLC (Figure 10).

Discussion
NSCLC cells grow and divide slowly in comparison to small cell lung cancer cells and disseminate relatively late. NSCLC accounts for around 80% of all lung malignancies [48], approximately 68% of which are diagnosed at a late stage with a poor 5-year survival rate [1]. Understanding the molecular process of NSCLC progression is essential for developing innovative therapeutic strategies and improving patients' survival rates.
With the introduction of bioinformatics into medical molecular biology [49], the scope of basic research can be expanded, and the prediction of important biomarkers can become more convenient and accurate. Furthermore, bioinformatics methods allow comprehensive exploration and analysis of mRNA data sets [50], miRNA data sets [51], and lncRNA data sets [52] from different databases, which improves the accuracy with which differentially expressed genes are determined.
This study screened three genes of research value from 919 DEGs. The main idea of this study was then to construct a ceRNA regulatory axis and to predict the potential miRNAs, lncRNAs, and pseudogenes acting upstream of the hub genes from the data sets. Finally, the hsa-miR-1-3p-PAICS regulatory axis was constructed. Through bioinformatic analysis and survival analysis, this study identified three genes (CCNB1, CDK1, and PAICS) as key genes linked with the development of NSCLC. The expression levels of CCNB1, CDK1, and PAICS were raised in NSCLC, and these genes have been implicated as oncogenes in the development of many human malignancies. Moreover, previous studies indicate that the three genes potentially function as cancer biomarkers. For instance, high mRNA expression of CCNB1 and CENPF can be regulated by hnRNPR, thus promoting the aggressiveness of gastric cancer [53]; Zhang et al. [54] suggested that silencing CCNB1 in pancreatic cancer inhibits cell proliferation and promotes cell senescence through the p53 pathway; Sepideh Izadi [55] found that CDK1 is an important target for breast cancer diagnosis and treatment; Huang et al. [56] indicated that the interaction between CDK1 and SOX2 promotes the stemness of lung cancer cells; the study of Shuyi Zhou [57] suggested that PAICS may provide a novel treatment target for lung adenocarcinoma; and Moloy Goswami et al. [58] confirmed that increased expression of PPAT and PAICS affects disease progression by regulating lung adenocarcinoma metabolism. From these reports and our analytic results, we conclude that CCNB1, CDK1, and PAICS may be three hub oncogenes in the development of NSCLC.
MiRNAs are non-coding RNA molecules that control biological activity by downregulating the expression of target genes [59]. We therefore set out to identify miRNAs that specifically target CCNB1, CDK1, and PAICS. Numerous miRNAs were predicted through a variety of online sources, including six for CCNB1, ten for CDK1, and seven for PAICS. Based on their mode of action, these miRNAs should function as tumor suppressor miRNAs in NSCLC. Following survival analysis, we picked eight miRNA-mRNA pairs for subsequent expression correlation analysis. Correlation analysis revealed a strong negative correlation only in the hsa-miR-1-3p-PAICS pair.
Taken together, the hsa-miR-1-3p-PAICS axis emerged as a possible route implicated in the development of NSCLC. Numerous studies have established that hsa-miR-1-3p is a critical inhibitor of the genesis and progression of a range of human malignancies. For example, the study of Zhanrui Mao [60] showed, through analysis of miRNA data in TCGA, that a low level of hsa-miR-1-3p may be an indication of CR and has a significant relationship with disease stage; Li et al. [61] suggested that the apoptosis and proliferation of hepatoma cells can be influenced by the upregulation of hsa-miR-1-3p. We then identified the genes co-expressed with PAICS. GO analysis revealed that these co-expressed genes were highly enriched in mitosis. Consequently, by regulating mitosis, the hsa-miR-1-3p-PAICS axis may restrict the division of NSCLC cells, thereby halting stage advancement.
Along with miRNAs, there are several additional forms of RNA, including lncRNAs and pseudogenes. They can affect health and disease, including cancer, by binding competitively to common miRNAs as ceRNAs [62]. Using the ENCORI database, we obtained 119 upstream pseudogenes of the hsa-miR-1-3p-PAICS axis. The GEPIA database was used to compare their expression between NSCLC samples and normal controls, as well as between major stages. Correlation analysis of expression data showed that hsa-miR-1-3p negatively correlated with FAM91A3P, LRRC37A6P, PKMP1, RPL9P32, and BMS1P8. Combining the ceRNA mechanism with the findings of the preceding analyses suggests that these pseudogenes may regulate the hsa-miR-1-3p-PAICS axis in NSCLC. Finally, the lncACTdb, miRNet, and ENCORI databases were used to determine the upstream regulatory lncRNAs of the hsa-miR-1-3p-PAICS axis. In total, 33 lncRNAs appeared in all three databases, and many of them have been reported to function as oncogenes in different human cancers. For example, lncRNA UCA1 promotes proliferation, migration, and immune escape and suppresses apoptosis in gastric cancer by binding anti-tumor miRNAs [63]; lncRNA CYTOR promotes tamoxifen resistance in breast cancer cells by binding miR-125a-5p [64]; and lncRNA RMRP promotes proliferation, migration, and invasion of bladder cancer via miR-206 [65]. These reports further indicate that these lncRNAs, like the possible pseudogenes identified above, may also participate in regulating the hsa-miR-1-3p-PAICS network and thus be involved in the development of NSCLC.
Although we constructed the hsa-miR-1-3p-PAICS axis to better understand the occurrence of NSCLC, our study has some limitations. Most importantly, it lacks experimental verification. Further in vivo and in vitro experiments will be conducted to confirm the expression and function of the key genes. Additionally, the binding affinity of the biomarkers identified in our study should be investigated experimentally.

Conclusion
In conclusion, integrated bioinformatics analyses indicate that the hsa-miR-1-3p-PAICS axis may contribute to the progression of NSCLC via mitosis regulation. Additionally, we identified putative upstream pseudogenes and long non-coding RNAs of the hsa-miR-1-3p-PAICS axis. In the future, the pseudogene/lncRNA-hsa-miR-1-3p-PAICS axis may serve as a biomarker and therapeutic target.

Data Availability
The data used to support the findings of this study are included within the supplementary information files.

Conflicts of Interest
The authors declare that there are no conflicts of interest for this work.

Authors' Contributions
Yichen Song and Zhiying Wang contributed equally to this work.