Interactive Verification Analysis of Multiple Sequencing Data for Identifying Potential Biomarker of Lung Adenocarcinoma

Background. Lung adenocarcinoma (LUAD) comprises around 40% of all lung cancers, and in about 70% of patients it has already spread locally or systemically when first detected, leading to a worse prognosis. Methods. We screened differentially expressed genes (DEGs) from the RNA sequencing data in the Gene Expression Omnibus database, then verified and further analyzed the screened DEGs using a combined bioinformatics approach. Results. The expression of 11,143 genes in 693 nontumor lung tissue and LUAD samples from 8 independent laboratories was analyzed; 188 mRNAs were identified as DEGs. A PPI network constructed with the 188 DEGs yielded 8 hub DEGs (CDH5, PECAM1, VWF, CLDN5, COL1A1, MMP9, SPP1, and IL6) that were highly interconnected with other nodes. The expression levels of the 8 hub genes in LUAD and control tissues were assessed in the Oncomine database, and the results were consistent. The survival curves of the 8 hub genes showed that their expression was significantly related to the prognosis of lung cancer and LUAD patients, except for IL6. Since the expression of IL6 is nonspecific and highly sensitive, we chose the other 7 verified hub genes for further analysis. Mutual exclusivity and cooccurrence analysis of the 7 hub genes identified a tendency towards cooccurrence between CDH5, PECAM1, and VWF in LUAD. The coexpression profiles of CDH5 in LUAD were identified, and we found that PECAM1 and VWF were coexpressed with CDH5. Immunohistochemistry and RT-PCR analysis showed that CDH5, PECAM1, and VWF were expressed at higher levels in normal lung tissues but at a low or undetectable level in LUAD tissues. Conclusions. Taken together, we speculate that CDH5, PECAM1, and VWF play an important role in LUAD.


Background
Lung cancer (LC) accounted for 13% of all cancer cases in the United States in 2019, and it caused the greatest number of cancer deaths in both men (24%) and women (23%). One-quarter of all cancer deaths are due to LC, which makes it the leading cause of cancer-related mortality [1]. LC is mainly divided into two subtypes, non-small-cell lung carcinoma (NSCLC) and small-cell lung carcinoma (SCLC), which account for 85% and 15% of all cases, respectively [2][3][4]. NSCLC can be classified into the major subtypes adenocarcinoma (AD) and squamous cell carcinoma (SCC). Lung adenocarcinoma (LUAD) is the most frequent histologic type of LC and comprises around 40% of all LC [5][6][7].
Patients whose LC has spread locally or systemically when first detected, who constitute 70% of all patients, usually receive chemotherapy and/or radiation therapy instead of surgery [8][9][10]. Hence, local extension and metastasis are primary causes of death in LUAD patients. Even more ominously, recurrence rates in resected stage I NSCLC still range between 22% and 38% [11,12], and nearly half of LUAD patients suffer a relapse and die as a result of disease recurrence [13].
Risk assessment and treatment planning for LUAD patients have usually depended on traditional risk factors including tumor size, stage, and lymph node status. However, these existing clinical methods for prognosis evaluation are invasive, unsystematic, and subjective; they offer little help for effective targeted therapy and do not clearly distinguish between high-risk and low-risk patients [14]. Therefore, a more accurate method is needed to manage this high-mortality disease, and there is an urgent need for one or a few accurate indicators of the genesis and development of LUAD. We hope that the results of our data analysis shed light on potential diagnostic and therapeutic targets in LUAD.
Although the mechanism of LUAD has been widely studied, its definite molecular cause remains unclear. Revealing the pathogenesis and underlying molecular mechanisms of LUAD is vital for early diagnosis, prevention, and targeted therapy. In the present study, we aimed to find one or several molecular biomarkers that may eventually be applied to the effective diagnosis and therapy of LUAD.
The microarray is a high-throughput platform that measures genome-wide gene expression. It has been widely used to search for genetic or epigenetic alterations and to identify molecular biomarkers, for example in carcinomas [15,16]. The extensive use of microarrays has produced huge amounts of data, most of which are stored and shared in public databases [17,18]. However, because some of these studies were limited by small study populations, single-center cohorts, and model overfitting, different researchers sometimes reached different conclusions. Several studies have sought differentially expressed genes (DEGs) in LUAD through gene expression profiling microarrays [19][20][21]. However, these independent studies involved heterogeneous tissues or samples and drew their results from single cohorts, so their conclusions were limited or inconsistent, and key genes and pathways were difficult to confirm across studies. To obtain a more accurate picture of the onset and progression of LUAD, we integrated, reanalyzed, and verified the relevant expression profiling microarray datasets uploaded to the Gene Expression Omnibus (GEO) database by different laboratories. This overcomes the one-sidedness of individual studies and increases statistical power, so the screening results should be more precise and reliable.
In the present study, we downloaded 8 original microarray datasets from the GEO database (https://www.ncbi.nlm.nih.gov/geo): GSE32863 (58 nontumor lung tissues, 58 LUAD tissues), GSE7670 (28 nontumor lung tissues, 28 LUAD tissues), GSE40791 (100 nontumor lung tissues, 94 LUAD tissues), GSE63459 (32 nontumor lung tissues, 33 LUAD tissues), GSE75037 (83 nontumor lung tissues, 83 LUAD tissues), GSE85841 (8 nontumor lung tissues, 8 LUAD tissues), GSE116959 (11 nontumor lung tissues, 57 LUAD tissues), and GSE118370 (6 nontumor lung tissues, 6 LUAD tissues). In total, 326 nontumor lung tissues and 367 LUAD tissues were available. Subsequently, the DEGs were screened using the R language, and 188 DEGs were filtered out from 11,143 genes based on the 8 independent datasets, which contained 693 cases. To better clarify the pathological mechanisms of LUAD, we performed cluster analysis, functional analysis, and biological pathway and process enrichment analysis for the 188 screened DEGs. To determine hub genes with a significant expression difference between normal lung and LUAD, we constructed a protein-protein interaction (PPI) network for the 188 DEGs screened with the threshold of |log2 FoldChange| > 2.0, and 8 hub genes were identified: CDH5, PECAM1, VWF, CLDN5, COL1A1, MMP9, SPP1, and IL6. To verify our screening results, the expression signatures of the hub DEGs in clinical cancer tissue were assessed in several databases. Their expression in normal lung and LUAD tissues was analyzed in the Oncomine database. The survival of LC and LUAD patients with high or low hub DEG expression was assessed with the Kaplan-Meier Plotter database. The coexpression analysis of the hub DEGs, conducted with cBioPortal, revealed cooccurrence or mutual exclusivity relationships and provided information on the possible underlying mechanism.
All in all, we hope to gain further insight into LUAD at the molecular level and to explore potential candidate biomarkers for diagnosis, prognosis, and drug targets.

Microarray Data Selection.
In the current study, the gene expression profiling datasets (ID: GSE32863, GSE7670, GSE40791, GSE63459, GSE75037, GSE85841, GSE116959, and GSE118370) were obtained from the Gene Expression Omnibus database of the National Center for Biotechnology Information (NCBI). "Lung adenocarcinoma," "Homo sapiens [organism]," and "expression profiling by array [dataset type]" were used as search keywords, which returned 260 results. We selected the microarray datasets according to the following rules: the samples must contain both LUAD tissues and normal lung tissues, the patients received no special treatment, the sequenced gene expression values are > 0, and the number of sequenced genes is > 10,000. Under these conditions, we obtained 8 datasets for further analysis. We extracted the expression data of all sequenced genes from the original studies by the 8 independent research groups. The following information was extracted from each screened study: GEO accession number, sample type, platform, number of normal and LUAD tissues, and gene expression data. The information on the selected GEO series is listed in Table 1. We downloaded the raw data of 693 specimens from the 8 independent GEO series, comprising 326 nontumor tissues and 367 LUAD specimens. The process of data filing is shown in Figure 1.
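As a quick consistency check on the totals above, the per-series counts quoted in the Background can be tallied directly. This is only an illustrative sketch; the dictionary layout and the function name `tally` are not from the original study.

```python
# Sample counts per GEO series as reported in the text: (nontumor, LUAD).
GEO_SERIES = {
    "GSE32863": (58, 58),
    "GSE7670": (28, 28),
    "GSE40791": (100, 94),
    "GSE63459": (32, 33),
    "GSE75037": (83, 83),
    "GSE85841": (8, 8),
    "GSE116959": (11, 57),
    "GSE118370": (6, 6),
}


def tally(series):
    """Return (nontumor_total, luad_total, grand_total) over all series."""
    normal = sum(n for n, _ in series.values())
    tumor = sum(t for _, t in series.values())
    return normal, tumor, normal + tumor


normal_total, luad_total, all_samples = tally(GEO_SERIES)
print(normal_total, luad_total, all_samples)
```

The tally reproduces the 326 nontumor and 367 LUAD specimens stated in the text.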

Data Preprocessing before Difference Analysis.
We utilized the robust multiarray average algorithm of the Affy package in the R language to convert the raw data to expression data. According to the platform annotation files, the expression levels of the probe sets were converted into gene expression levels by the Bioconductor annotation function of R. Expression values of multiple probes for a given gene were … The filter criteria were |log2 FoldChange| (|log2 FC|) > 2, equivalent to |FoldChange| > 4, and adjusted p value < 0.05. A volcano plot, representing the distribution of the fold change and p value of all genes, was drawn. A heat map of expression hierarchical clustering analysis for the 188 DEGs was produced to investigate probable discrepancies between normal lung and LUAD tissues.
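The filter criteria can be written as a small predicate. The sketch below is illustrative only: the study performed this step in R, and the gene names and statistics used here are hypothetical placeholders rather than values from the paper.

```python
LOG2FC_CUTOFF = 2.0  # |log2 FC| > 2, i.e. more than a 4-fold change
ADJ_P_CUTOFF = 0.05  # adjusted p value < 0.05


def is_deg(log2_fc, adj_p, fc_cutoff=LOG2FC_CUTOFF, p_cutoff=ADJ_P_CUTOFF):
    """True when a gene passes both the fold-change and the p-value filter."""
    return abs(log2_fc) > fc_cutoff and adj_p < p_cutoff


def filter_degs(rows):
    """rows: iterable of (gene, log2_fc, adj_p); returns the genes that pass."""
    return [gene for gene, log2_fc, adj_p in rows if is_deg(log2_fc, adj_p)]


# Hypothetical statistics, for illustration only.
example = [
    ("GENE_A", 2.8, 0.001),   # passes both filters
    ("GENE_B", -3.1, 0.010),  # passes both filters (downregulated)
    ("GENE_C", 1.5, 0.001),   # fold change too small
    ("GENE_D", 2.5, 0.200),   # not significant
]
print(filter_degs(example))
```

Because the threshold is applied to the absolute value of log2 FC, upregulated and downregulated genes are treated symmetrically.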

Functional and Pathway Enrichment Analysis for All DEGs.
To explore the main molecular functions and pathways involving the DEGs, we performed functional enrichment analysis using FunRich, a stand-alone functional enrichment and network analysis tool. It was used to perform cellular component, functional (molecular function and biological process), and pathway (biological pathway) enrichment analysis for the obtained DEGs, with p value < 0.05 as a strict cutoff.
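Enrichment p values of this kind are commonly computed with a hypergeometric (one-sided Fisher) test. The sketch below assumes that test; the text does not state which statistic FunRich applied in this study, and the function and parameter names are illustrative.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """P(X >= overlap) when `sample` genes are drawn without replacement
    from `population` genes, of which `annotated` belong to the term."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, sample - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Toy example: 10 genes in total, 4 annotated to a term, 3 DEGs, 2 of them annotated.
print(hypergeom_enrichment_p(10, 4, 3, 2))
```

In the setting of this study, `population` would correspond to the 11,143 analyzed genes and `sample` to the 188 DEGs; the per-term counts are not given in the text.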

Protein-Protein Interaction (PPI) Network Construction and Hub Gene Identification.
Functional protein-protein interaction (PPI) analysis is essential to interpret the molecular mechanisms of key cellular activities in carcinogenesis. The network was constructed on the basis of the Search Tool for the Retrieval of Interacting Genes (STRING) database [22]. We constructed a PPI network for all DEGs and visualized the interaction network with the cutoff criterion of interaction score > 0.4.
Hub genes were selected with interaction degree > 15; 8 hub genes that were highly interconnected with other nodes were finally selected.
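The two cutoffs (interaction score > 0.4, degree > 15) amount to filtering an edge list and counting neighbors. The following is a minimal sketch under the assumption that the network is available as (gene, gene, score) tuples with scores scaled to [0, 1]; the function name `hub_genes` and the toy edges are hypothetical.

```python
from collections import defaultdict


def hub_genes(edges, min_score=0.4, min_degree=15):
    """edges: iterable of (gene_a, gene_b, score) with score in [0, 1].
    Returns {gene: degree} for genes with more than `min_degree` neighbors
    among the edges whose score exceeds `min_score`."""
    neighbors = defaultdict(set)
    for gene_a, gene_b, score in edges:
        if gene_a != gene_b and score > min_score:
            neighbors[gene_a].add(gene_b)
            neighbors[gene_b].add(gene_a)
    return {g: len(nb) for g, nb in neighbors.items() if len(nb) > min_degree}


# Toy network: one node linked to 16 partners, plus one low-confidence edge.
toy_edges = [("HUB", f"G{i}", 0.9) for i in range(16)] + [("HUB", "LOW", 0.3)]
print(hub_genes(toy_edges))
```

Using sets for the neighbor lists means that duplicate edges between the same pair of genes are counted once.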

Oncomine Database Analysis and Kaplan-Meier Plotter Analysis for DEGs.
Oncomine is a cancer transcriptomic database and web-based discovery platform with genome-wide expression analyses of various cancers [23,24]. The expression levels of the 8 screened hub DEGs were analyzed using the Oncomine Cancer Profiling Database (https://www.oncomine.org); we compared their expression between LUAD tissues and normal lung tissues. Immunohistochemistry of CDH5, PECAM1, and GAPDH on lung and LUAD tissue was adopted from the Human Protein Atlas (http://www.proteinatlas.org).
The Kaplan-Meier Plotter is a database that can be used to assess the effect of 54,675 genes on patient survival using 10,461 cancer samples (breast, ovarian, lung, and gastric cancer) [25]. For survival analyses, we analyzed the prognostic value of the 8 screened hub DEGs in LC and LUAD using the Kaplan-Meier Plotter (http://kmplot.com/analysis/) and tested for significance using logrank tests. The analysis was performed according to the instructions provided with the tool. All hub DEGs except IL6 had a significant correlation with the overall survival of LC and LUAD patients.
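For readers unfamiliar with the two statistics named above, the sketch below implements a textbook Kaplan-Meier estimator and a two-group logrank test in plain Python. It is not the Kaplan-Meier Plotter's own code, and all function names are illustrative.

```python
import math


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.
    times: follow-up durations; events: 1 if death observed, 0 if censored."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv, curve, i = 1.0, [], 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / n_at_risk
            curve.append((t, surv))
        n_at_risk -= removed
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group logrank test; returns (chi2, p) with 1 degree of freedom.
    All arguments are lists of equal length within each group."""
    pooled = zip(times_a + times_b, events_a + events_b)
    event_times = sorted({t for t, e in pooled if e})
    obs_minus_exp = var = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            var += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / var if var > 0 else 0.0
    p = math.erfc(math.sqrt(chi2 / 2))  # chi-square survival function, 1 df
    return chi2, p


print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```

In a high-versus-low expression comparison, group A and group B would be the patients above and below the chosen expression cutoff.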
Coexpression analysis in Oncomine was used to identify sets of genes with synchronous expression patterns. The coexpression profiles of CDH5 in LUAD were identified and presented as a heat map.

Quantitative Real-Time PCR Analysis and Immunohistochemistry
Total RNA from 2 paired lung and LUAD tissues was extracted using TRIzol (Invitrogen). The cDNA was reverse-transcribed from 1 mg of total RNA using the reverse transcriptase kit (Toyobo). Q-PCR was performed using a 7500 Real-Time PCR System (Applied Biosystems) and SYBR Green PCR Master Mix (BioRad). GAPDH was used for normalization. Primer sequences are listed in Table 2.
… value of the sample, which is almost on a straight line after batch normalization, suggesting that normalized data were qualified.
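Normalizing Q-PCR data to a reference gene such as GAPDH is often done with the 2^-ΔΔCt method. The sketch below assumes that method, which the text does not name explicitly; the Ct values are hypothetical and the function name is illustrative.

```python
def relative_expression(ct_target_tumor, ct_ref_tumor, ct_target_normal, ct_ref_normal):
    """Fold change of a target gene in tumor versus paired normal tissue,
    normalized to a reference gene (GAPDH here), via 2^-ddCt (assumed method)."""
    delta_tumor = ct_target_tumor - ct_ref_tumor      # dCt in LUAD tissue
    delta_normal = ct_target_normal - ct_ref_normal   # dCt in paired lung tissue
    return 2 ** -(delta_tumor - delta_normal)


# Hypothetical Ct values: the target amplifies 3 cycles later in the tumor.
print(relative_expression(28.0, 18.0, 25.0, 18.0))
```

A value below 1 indicates lower mRNA abundance in the tumor sample than in its paired normal tissue.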

Selection of DEGs and Expression Hierarchical Clustering Analysis.
We used the R Limma package to identify the genes that were aberrantly expressed, with the threshold of |log2 FC| > 2 and p < 0.05. The DEGs … In total, 188 DEGs (44 upregulated and 144 downregulated) were obtained based on the gene expression data of 693 patients (326 normal lung and 367 LUAD specimens from 8 GEO series). We list the top 40 DEGs according to the fold change of the gene expression value in Table 3. The volcano plot (Figure 3) shows the distribution of all DEGs in terms of fold change [log2 FoldChange] (Y-axis) and p value [-log10 (p value)] (X-axis). In Figure 4, the fold change patterns of all DEGs are displayed in a heat map to evaluate and compare differences in gene expression between normal lung and LUAD.
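The two quantities shown in a volcano plot, and the up/down labels implied by the thresholds, can be computed per gene as follows. This is a hedged sketch with hypothetical input values; the original analysis was done in R, and the function names are illustrative.

```python
import math
from collections import Counter


def volcano_point(log2_fc, p_value, fc_cutoff=2.0, p_cutoff=0.05):
    """Return (log2 FC, -log10 p, label), label being 'up', 'down', or 'ns'."""
    neg_log10_p = -math.log10(p_value)
    if p_value < p_cutoff and log2_fc > fc_cutoff:
        label = "up"
    elif p_value < p_cutoff and log2_fc < -fc_cutoff:
        label = "down"
    else:
        label = "ns"
    return log2_fc, neg_log10_p, label


def count_labels(stats):
    """stats: iterable of (log2_fc, p_value); returns a Counter of labels."""
    return Counter(volcano_point(fc, p)[2] for fc, p in stats)


# Hypothetical statistics, for illustration only.
print(count_labels([(3.0, 0.001), (-2.5, 0.01), (1.0, 0.001), (3.0, 0.2)]))
```

Applied to the full result table, the "up" and "down" counts would correspond to the 44 upregulated and 144 downregulated DEGs reported above.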

Function and Pathway Enrichment Analysis of all DEGs.
Cellular component enrichment analysis of all DEGs described their distribution and structure (Figure 5(a)). In terms of molecular function, the DEGs were significantly enriched in cell adhesion molecule activity, extracellular matrix structural constituent, metallopeptidase activity, calcium ion binding, and receptor activity (Figure 5(b)). To better clarify the pathological mechanisms, we performed biological pathway enrichment analysis. According to the results, the DEGs were mainly enriched in epithelial-to-mesenchymal transition (EMT), cell surface interactions at the vascular wall, mesenchymal-to-epithelial transition (MET), platelet adhesion to exposed collagen, and so on (Figure 5(c)).
To further investigate the biological effects of the aberrantly expressed DEGs in LUAD, biological process enrichment analysis of the 188 screened DEGs was carried out. The top 9 enriched biological processes are shown in Figure 5(d). The functions in the biological process category were enriched in cell communication, signal transduction, cell growth and/or maintenance, aldehyde metabolism, and so on.

PPI Network Construction and Hub Gene Selection.
Based on the information in the STRING protein query from public databases, we constructed the PPI network for the 188 DEGs using |log2 FoldChange| > 2.0 as the screening index (…

Survival Analysis for Obtained Hub DEGs with Kaplan-Meier Plotter.
According to our previous bioinformatics analyses and validation, the hub genes' expression in LUAD patients in the Oncomine database is consistent with our results from the GEO series. To explore the association of the expression of the 8 hub genes with the prognosis of LUAD patients, survival curves were drawn using the Kaplan-Meier Plotter database. As shown in Figure 8, low expression of CDH5, PECAM1, VWF, and CLDN5 was associated with worse prognosis (p < 0.05), and high expression of COL1A1, MMP9, and SPP1 was associated with worse prognosis (p < 0.05). The differences were statistically significant. In other words, low expression of the screened upregulated hub genes was associated with a better prognosis, and low expression of the screened downregulated hub genes with a worse prognosis, except for the IL6 gene. As a characteristic cytokine expressed in plasma and associated with inflammation, IL6 has a nonspecific and sensitive expression, so we chose the other 7 verified hub genes for the next analysis.

Coexpression Analysis and Genetic Alterations of Obtained Hub DEGs in LUAD.
The OncoPrint from cBioPortal is a concise and compact graphical summary of genomic alterations in multiple genes across a set of tumor samples. It summarizes distinct genomic alterations including mutations, CNAs (amplifications and homozygous deletions), and changes in gene expression or protein abundance. Based on the previous results of difference analysis, expression validation, PPI network construction, and survival analysis, VWF, CLDN5, CDH5, COL1A1, MMP9, PECAM1, and SPP1 were hub genes highly interconnected with other DEGs. The expression of these hub genes was compared and analyzed in the GEO database and Oncomine database; their expression differences between … The relationship between their expression and overall survival was also validated: there is a significant correlation between the hub genes' expression and LUAD patients' survival time. We analyzed genomic alterations of the screened hub DEGs using cBioPortal and visualized gene alterations across a set of LUAD cases (Figure 9(a)). OncoPrint can also help identify trends such as mutual exclusivity or cooccurrence between genes. The mutual exclusivity and cooccurrence analysis from cBioPortal can be exploited to identify previously unknown mechanisms that contribute to oncogenesis and cancer progression, so we used cBioPortal to explore the potential relationships between the 7 hub genes. As Table 4 shows, there was a tendency towards cooccurrence between CDH5 and PECAM1 or VWF in LUAD (p < 0.05).
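The cooccurrence statistics reported for gene pairs (a log odds ratio and a Fisher exact test p value) can be derived from a 2x2 table of altered and unaltered samples. The sketch below is a generic implementation, not cBioPortal's code; the base-2 logarithm and the one-sided test towards cooccurrence are assumptions, and the counts are hypothetical.

```python
from math import comb, inf, log2, nan


def cooccurrence_test(both, only_a, only_b, neither):
    """2x2 table of samples altered in both genes, only A, only B, or neither.
    Returns (log2 odds ratio, one-sided Fisher exact p towards cooccurrence)."""
    n = both + only_a + only_b + neither
    altered_a = both + only_a
    altered_b = both + only_b
    num, den = both * neither, only_a * only_b
    if num == 0 and den == 0:
        log_or = nan
    elif den == 0:
        log_or = inf
    elif num == 0:
        log_or = -inf
    else:
        log_or = log2(num / den)
    # P(overlap >= both) under the hypergeometric null of independence.
    upper = min(altered_a, altered_b)
    tail = sum(
        comb(altered_a, x) * comb(n - altered_a, altered_b - x)
        for x in range(both, upper + 1)
    )
    return log_or, tail / comb(n, altered_b)


# Hypothetical counts for a pair of genes across 16 tumors.
print(cooccurrence_test(both=8, only_a=2, only_b=1, neither=5))
```

A positive log odds ratio with a small p value indicates a tendency towards cooccurrence, while a negative value points towards mutual exclusivity.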

BioMed Research International
Coexpression analysis in Oncomine is a tool that can be used to identify sets of genes with synchronous expression patterns. The coexpression profiles of CDH5 in LUAD were identified and presented as a heat map. We identified the coexpression profile of CDH5 as a strong cluster of the top 20 genes across a panel of 107 LUAD tissues. The result showed that PECAM1 and VWF, which were among the DEGs that we screened out from LUAD and control tissues based on the GEO database, were coexpressed with CDH5 (Figure 9(b)).
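Ranking genes by how closely their expression follows a target gene can be illustrated with a Pearson correlation. This is a sketch under the assumption that coexpression is scored by Pearson's r, which the text does not specify for Oncomine; the expression values below are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def top_coexpressed(target, expression, k=20):
    """expression: {gene: values across the same samples}.
    Returns the k genes most positively correlated with `target`."""
    scores = [
        (gene, pearson_r(expression[target], values))
        for gene, values in expression.items()
        if gene != target
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)[:k]


# Hypothetical expression values across five samples.
toy = {
    "CDH5": [1, 2, 3, 4, 5],
    "PECAM1": [2, 4, 6, 8, 10.5],
    "VWF": [1, 3, 2, 5, 4],
    "SPP1": [5, 4, 3, 2, 1],
}
print(top_coexpressed("CDH5", toy, k=2))
```

With `k=20`, the function returns a list analogous to the top 20 coexpressed genes described above.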

Immunohistochemistry and RT-PCR Analysis of CDH5, PECAM1, and VWF.
According to our results, CDH5, PECAM1, and VWF were DEGs screened with the threshold of |log2 FC| > 2.0 and p < 0.05, and they were hub genes in the PPI network. Their expression in normal lung and LUAD was verified in the Oncomine database, and their expression was significantly related to the prognosis of lung cancer and LUAD patients; moreover, there is a tendency towards cooccurrence between CDH5, PECAM1, and VWF in LUAD. We further verified the expression of CDH5, PECAM1, and VWF through immunohistochemistry and RT-PCR analysis. Immunohistochemistry (IHC) data from the Human Protein Atlas (http://www.proteinatlas.org) indicated strong expression of CDH5, PECAM1, and VWF protein in lung tissues, but not in LUAD tissues (Figure 10(a)). The mRNA levels of CDH5, PECAM1, and VWF were noticeably decreased in LUAD tissues compared to paired lung tissues (Figure 10(b)).

Conclusion
Our study analyzed whole genome sequencing results from different laboratories and screened out DEGs from 5 different sequencing platforms, covering 8 original microarray datasets and 693 cases. There were 44 upregulated DEGs and 144 downregulated DEGs in LUAD with the threshold of |log2 FC| > 2.0 and p < 0.05. Biological process analysis, biological pathway analysis, and PPI network analysis provided a set of related genes and pathways to help elucidate the molecular mechanisms of LUAD. Validation verified that the expression levels of the DEGs in the Oncomine database are consistent with their expression levels in the GEO series. The survival curves of the hub genes showed that their expression was significantly related to the prognosis of LUAD patients (p < 0.05), except for IL6. At this point, we believe CDH5, PECAM1, VWF, CLDN5, COL1A1, MMP9, and SPP1 play a vital role in LUAD. Mutual exclusivity and cooccurrence analysis of the 7 screened hub genes showed a tendency towards cooccurrence between CDH5, PECAM1, and VWF in LUAD (p < 0.05). The coexpression profiles for CDH5 obtained from Oncomine then showed that PECAM1 and VWF were coexpressed with CDH5 in LUAD, and they were also DEGs that were screened out from LUAD based on our previous results. Immunohistochemistry and RT-PCR analysis showed that CDH5, PECAM1, and VWF were expressed at higher levels in normal lung tissues but at a low or undetectable level in LUAD tissues. From all the above results, we speculate that CDH5, PECAM1, and VWF play an important role in LUAD. Because we analyzed all GSE series in the GEO database that compared normal lung and LUAD tissues, the prediction is more accurate and the bias of individual studies can be overcome. Our study provides information for researchers to identify possible candidate genes and pathways which may be involved in LUAD for further studies.

Discussion
Worldwide, approximately 2,093,800 patients were diagnosed with lung cancer and 1,761,000 were expected to succumb to the disease in 2018. Statistically, in both sexes combined, lung cancer is the most commonly diagnosed cancer (11.6% of the total cases) and the leading cause of cancer death (18.4% of the total cancer deaths) [27,28]. LC was the most frequent cancer and the leading cause of cancer death among men and women in 2019 in the United States [1]. There are 2 main forms of LC: NSCLC (85% of patients) and small-cell lung cancer (SCLC) (15%). Adenocarcinoma is the most common type of NSCLC and accounts for approximately 40% of lung cancers [29][30][31]. The most common diagnostic means for LC is fiberoptic bronchoscopy; it can help to diagnose NSCLC, but quite often, the amount of obtained material is not sufficient to subclassify NSCLC in more detail or for targeted therapies [32]. The vast majority of LC patients are not diagnosed until an advanced stage, so they have a worse prognosis and a high risk of distant recurrence and death [33]. We know little about targets for the early detection of LUAD. Consequently, there is an urgent need for diagnostic molecular features or biomarkers that can be associated with survival and disease recurrence in LUAD.
A field which has recently contributed significantly to improving diagnostics, classification, and prognostics is LUAD transcriptomics. The microarray, a whole-transcriptome high-throughput analysis technique which identifies changes in mRNA expression, is now being used to gain a more detailed understanding of the molecular mechanism of LUAD [34,35]. By analyzing whole-transcriptome results from different laboratories, statistical power is increased and prediction is more accurate; moreover, the bias of individual studies can be overcome. In the current study, we focused on the aberrantly expressed mRNAs in LUAD based on GEO RNA-seq data, and the common DEGs screened out from the data of different researchers, covering 693 samples, were listed. There were 44 upregulated DEGs and 144 downregulated DEGs in LUAD with the threshold of |log2 FC| > 2 and p < 0.05.
Biological pathway analysis showed that the DEGs were mainly involved in epithelial-to-mesenchymal transition (EMT), cell surface interactions at the vascular wall, mesenchymal-to-epithelial transition (MET), platelet adhesion to exposed collagen, and the glypican pathway. In the past decades, an increasing number of studies have shown that EMT is associated with poor prognosis in different tumor types including NSCLC [36,37]. EMT, as well as its reverse process, MET, is thought to be involved in the pathogenesis of numerous lung diseases ranging from developmental disorders and fibrotic tissue remodelling to lung cancer [38]. Kakolyris et al. have previously shown in NSCLC an association of high mitogenic/angiogenic factor expression with high angiogenesis and poor prognosis [39]. Glypican-3 (GPC3) is a membrane-bound proteoglycan belonging to the glypican-related integral membrane proteoglycan family, which includes six members (GPC1-GPC6). It has been identified as a potential biomarker candidate in lung carcinoma, severe pneumonia, and acute respiratory distress syndrome (ARDS) [40]. Glypican-5 (GPC5) was reported as a novel tumor metastasis suppressor in LUAD that acts through suppression of EMT [41]. Function analysis can help us better understand the mechanism of LUAD and provide guidance for LUAD prevention and treatment; however, further laboratory and clinical research is required.
The PPI network of the 188 DEGs, which were screened from 693 LUAD and control tissues using |log2 FoldChange| > 2.0 as the screening index, helped us find the 8 hub DEGs with the most functional connections: CDH5, PECAM1, VWF, CLDN5, COL1A1, MMP9, SPP1, and IL6. Each of the 8 hub genes interacts with more than 15 proteins. To verify our previous results in this paper, we assessed the expression … candidate biomarkers' characteristics of relative stability, further analysis was performed with the remaining 7 hub genes except IL6.
Figure 8: Prognostic value of 7 remaining hub genes except IL6 in lung cancer and LUAD. Data were obtained from the Kaplan-Meier Plotter database. The p value was calculated by a logrank test.
We drew survival curves of the screened hub genes and found that hub gene expression was significantly associated with the prognosis of LC/LUAD patients (p < 0.01). OncoPrint helped us identify trends such as mutual exclusivity or cooccurrence of the screened hub genes. We found that there was a tendency towards cooccurrence between CDH5, PECAM1, and VWF in LUAD (p < 0.05). Then, coexpression analysis with the Oncomine database for CDH5 found that CDH5 was coexpressed with PECAM1 and VWF in LUAD, and they were also DEGs that were screened out from LUAD based on our previous results. Our results seem to show that CDH5, PECAM1, and VWF play a vital role in LUAD. CDH5 encodes Cadherin-5, which is localized at intercellular junctions of endothelial cells, plays an important role in the control of vascular integrity and permeability, and contributes to endothelial cell assembly into tubular structures [42]. Many studies have reported that CDH5 expression is associated with multiple tumors [43,44], such as gastric cancer and breast cancer, but the relationship between CDH5 and LUAD is still to be determined. PECAM1 is a multifunctional cell adhesion molecule involved in numerous physiologic processes within the vasculature; Abraham et al. found that the activity of PECAM1 appears to be associated with the tumor microenvironment and tumor cell proliferation [45]; Kuang et al. demonstrated that PECAM1 could be a potential prognostic factor and therapeutic target in NSCLC [46]. The von Willebrand factor (VWF) is a multimeric glycoprotein and plays an essential role in mediating platelet-tumor cell interactions [47]. Research on the relationship between VWF and LUAD is still underway. From all the above results, we speculate that CDH5, PECAM1, and VWF play an important role in LUAD. This study had several limitations. First, the expression differences of the screened downregulated hub genes in LUAD patients in the Oncomine database were not statistically significant (p > 0.05), but based on the figure, the trends of the hub genes' expression were consistent with the GEO database; the lack of statistical significance may be due to insufficient samples. Second, even though we performed preliminary validation of the results, more in-depth studies are needed in the future. Therefore, we hope that these results can be integrated into future experiments and facilitate further understanding of the molecular mechanisms of LUAD.
The correlation analysis of 7 hub genes showed that the relationships of gene pairs that were statistically significant (p < 0.05) all had a tendency towards cooccurrence. Log odds ratio > 0: association towards cooccurrence. Log odds ratio ≤ 0: association towards mutual exclusivity. p value < 0.05: significant association. p value: derived from Fisher Exact Test. Log odds ratio: quantifies how strongly the presence or absence of alterations in gene A is associated with the presence or absence of alterations in gene B in the selected tumors.
Despite these limitations, we believe that this analysis represents a valuable resource and can be considered a preliminary study for future studies of LUAD. Our study provides information for researchers to identify possible candidate genes and pathways which may be involved in LUAD for further studies. We gained further insight into LUAD carcinogenesis at the molecular level and explored potential candidate biomarkers for diagnosis, prognosis, and drug targets.

Data Availability
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article. The raw data supporting the conclusions of this manuscript will be made available by the authors, without undue reservation, to any qualified researcher.