Mining Database to Identify Aging-Related Molecular Subtype and Prognostic Signature in Lung Adenocarcinoma

Background Lung cancer is emerging as one of most deadly diseases, and the mortality rate was still high with 5-year overall survival rate less than 20%. Aging is referred as protumorigenic state, and it plays a significant role in cancer development. Methods Molecular subtype of lung cancer was identified by consensus cluster analysis. Prognostic signature was constructed using LASSO cox regression analysis. CeRNA network was constructed to explore lncRNA-miRNA-mRNA regulatory axis. Results A total of 27 differentially expressed aging-related genes (ARGs) were obtained in LUAD. Three clusters of TCGA-LUAD patients with significant difference in prognosis, immune infiltration, chemotherapy, and targeted therapy were identified. We also developed an aging-related prognostic signature that had a better performance in predicting the1-year, 3-year, and 5-year overall survival of LUAD. Further analysis suggested a significant correlation between prognostic signature gene expression and clinical stage, immune infiltration, tumor mutation burden, microsatellite instability, and drug sensitivity. We also identified the lncRNA UCA1/miR-143-3p/CDK1 regulatory axis in LUAD. Conclusion Our study identified three clusters of TCGA-LUAD patients with significant difference in prognosis, immune infiltration, chemotherapy, and targeted therapy. We also developed an aging-related prognostic signature that had a good performance in the prognosis of LUAD.


Introduction
Lung cancer is emerging as one of most deadly diseases with an estimated 2.09 million new cases and 1.76 million deaths per year globally [1]. Lung adenocarcinoma (LUAD) is the main type of lung cancer. Despite great advance had been achieved in the diagnosis and therapy of lung cancer, the mortality rate was still high, and 5-year overall survival rate is less than 20% [2,3]. Apart from the TNM staging system, there are still no ideal biomarkers or signatures to predict the prognosis of lung cancer patients. Increasing evidences revealed that molecularly-defined subtypes could provide novel strategies for the therapy and prognosis of lung cancer [4]. Thus, it is significance to develop effective prognostic signature and molecular subtype for lung cancer.
Aging is a complex process associated with various molecular and cellular mechanisms [5]. Aging facilitates a series of degenerative pathologies with characteristics of debilitating losses of tissue or cellular function [6]. The incidence of malignancy increases as the age increases [7]. Moreover, aging is referred as protumorigenic state, and it plays a significant role in cancer development [8]. Moreover, aging-related signature could serve as prognostic biomarker for many types of cancers, including colorectal cancer [9], ovarian cancer [10], and glioma [11]. However, the significance role of aging-related genes (ARGs) in LUAD had not been clarified.
In the current study, we conducted consensus clustering for differently expressed ARGs in the Cancer Genome Atlas (TCGA) database. This was followed by the correlation analysis between molecular subtype and drug sensitivity as well as immune infiltration. We then developed a prognostic risk model based on prognostic ARGs. Moreover, we also clarified the potential molecular mechanism by constructing a ceRNA network. Our result may reveal the potential implication of ARGs as marker for the prognosis and therapy of LUAD patients.

Data Acquisition and Preprocessing.
Human ARGs were isolated from HAGR on March 1, 2022 (n = 307, http:// genomics.senescence.info/genes/, Supplementary Table 1) [12]. Gene expression profile of LUAD was isolated from the TCGA database (https://portal.gdc.cancer.gov/), using the limma package in the R software to study the differentially expressed genes (DEGs) with adjusted P < 0:05 and fold change > 2 as the threshold value. The differentially expressed ARGs were shown with Venn diagram. The somatic mutation data of LUAD was downloaded from UCSC Xena (https://xena.ucsc.edu/), and the result was shown with "maftools" package. Copy number variation (CNV) oncoplot of differentially expressed ARGs in LUAD was drawn with GISTIC2.0 [13].

Consensus Cluster Analysis.
Based on differentially expressed ARGs, we then identified the optimum number of clusters of LUAD with "ConsensusClusterPlus" package. The survival curve of each cluster in LUAD was drawn using "survival" package. Moreover, immuneeconv algorithm were used to evaluate the immune score. The immune cell abundance and immune-checkpoint-related gene expression in each cluster was evaluated with Student t-test with "ggplot 2" package. Moreover, we also used "pRRophetic" R package to calculate the chemotherapeutic response in each cluster.

Development of Aging-Related Prognostic Signature
Analysis. Univariate cox proportional hazard regression analysis was conducted to explore aging-related prognostic genes (P < 0:05). This was followed by the development of agingrelated prognostic signature by using the least absolute shrinkage and selection operator (LASSO) regression algorithm in "glmnet" package. The risk score of LUAD cases was established as follows: risk score = ∑i = 1nCoef ðiÞ × xðiÞ. The nearest neighbor estimation (NNE) method was utilized to evaluate the 3-year survival and 5-year survival of LUAD. ROC curve was drawn with "survivalROC" R package. Using "ggDCA" package, we also draw a decision curve analysis (DCA) to evaluate the prediction ability of this signature. Further, Pearson correlation analysis was conducted to analyze the correlation between risk score and immune infiltration.

Risk Module Gene Analysis.
Kruskal-Wallis test was performed to evaluate the differences of the risk module gene expression in different stages of LUAD patients. After we obtained the TMB/MSI score of LUAD patients from the TCGA database, we then analyzed their correlation with the risk module gene expression with Spearman's method. Pearson correlation analysis was performed to get the correlation between risk module gene expression and drug IC50 of Genomics of Therapeutics Response Portal (CTRP) and immune cell abundance of TIMER (https://cistrome.shinyapps.io/timer/). After we downloaded the cell line mRNA expression profile from the CCLE dataset (https://portals.broadinstitute.org/ccle), we then explored the risk module gene expression in different types of LUAD cell with "ggplot 2" package. The miRNA targets were identified by using miRDB (http://mirdb.org/), StarBase (http://starbase.sysu.edu.cn/), and miRWalk (http://mirwalk .umm.uni-heidelberg.de/). And the lncRNA targets were identified by LncBase (http://carolina.imis.athena-innovation.gr/) and StarBase (http://starbase.sysu.edu.cn/).

Identification of ARG Expression and Their Mutation
Landscape of in LUAD. Figures 1(a) and 1(b) show the DEGs in LUAD. And a total of 1091 DEGs were identified. Among these DEGs, 27 was differentially expressed ARGs (Figure 1(c)). We then explored the genetic mutation of these 27 differentially expressed ARGs in LUAD, and the results were shown in Figures 1(d)-1(f). The data revealed that top 3 genes with the highest mutation frequency in LUAD were LEPR, A2M, and CDKN2A (Figures 1(d) and 1(e)). CNV analysis demonstrated a significant homozygous deletion of KL, BUB1B, and LMNB1 in LUAD (Figure 1(f)). Moreover, most differentially expressed ARGs had a homozygous amplification in LUAD (Figure 1(f)).

Consensus Clustering of ARGs in Three Clusters in LUAD.
Consensus clustering analysis was utilized to distinguish LUAD patients based on 27 differently expressed ARGs. Interestingly, these differently expressed ARGs could separate TCGA-LUAD patients into three categories according to CDF values and delta area (Figures 2(a)-2(d) ). Among these categories, LUAD patients in cluster 2 had a worse prognosis while LUAD patients in cluster 3 had a best prognosis ( Figure 2(e), p = 0:00018). Considering the significant role of chemotherapy and targeted therapy in LUAD, we then evaluate the response of this three clusters to some common chemotherapeutic drugs and targeted therapeutics. The data suggested that LUAD patients in cluster 3 could be more resistant to commonly chemotherapy and targeted therapy, including paclitaxel, gemcitabine, cisplatin, and gefitinib (Figures 2(f)-2(i), all P < 0:05). Increasing evidences suggested immunotherapy as the most promising therapeutic strategy for LUAD patients in advance stage [14,15]. In the current study, the data demonstrated an immune checkpoint expression (Figure 3(a), all P < 0:05), immune score ( Figure 3(b), p < 0:05), and TIDE score ( Figure 3(c), all p = 1:1e − 13) in cluster 2 versus cluster 1 and cluster 3 in LUAD. Cancer stem cells (CSCs) are believed to be responsible for tumor growth and maintenance, and they are involved in the resistance to conventional chemotherapy and radiation and tumor metastasis and recurrence. In our study, we found that LUAD patients in cluster 1 had a higher mRNAsi score versus cluster 2 and cluster 3 (Figure 3 3.3. The Functional Enrichment of ARGs. The GO analysis revealed that these ARGs were mainly associated with aging, cellular response to oxidative stress, cell aging, positive regulation of pri-miRNA transcription by RNA polymerase II, kinase regulator activity, growth factor binding, and protein kinase inhibitor activity (Figure 4(a)). Moreover, ARGs were mainly associated with aging, cellular response to chemical stress, gliogenesis, response to drug, mitolic cell cycle checkpoint, and circadian rhythm in KEGG analysis (Figure 4(b)).            Combined with these results, a total of 17 ARGs were suggested as potential prognostic biomarkers for LUAD,      (Figures 6(f)-6(h)). We then analyzed the correlation between the risk score of LUAD patients and immune cell infiltration, which demonstrated a negative correlation between risk score and the abundance of B cells (Figure 7(a), p = 2:65e − 12) and CD4+ T cells (Figure 7(b), p = 0:01). Moreover, risk score showed positive correlation with neutrophil immune infiltration (Figure 7(d), p = 1:86e − 4). However, there is no significant correlation between risk score and the abundance of CD8+ T cells (Figure 7(c)), macrophage (Figure 7(e)), and dendritic cells (Figure 7(f)).

Comprehensive Analysis of Prognostic Signature Genes.
The expression of FOXM1 (p = 8:2e − 5) and CDK1 (p = 0:00021) increases as clinical stage is increasing in LUAD (Figure 8(a)), suggested that FOXM1 and CDK1 may be involved in the progression of LUAD. TMB and MSI were suggested as predictive markers for tumor immu-notherapy efficacy [16]. In our study, the TMB score showed negative association with KL expression (p = 7:76e − 8) and positive association with FOXM1 expression (p = 2:47e − 21) and CDK1 expression (p = 2:2e − 18) (Figure 8(b)). Moreover, the TFAP2A expression was positively correlated with MSI score in LUAD (Figure 8(c), p = 0:008). We then analyzed the correlation between the correlation between existing therapy target and gene expression. The current study revealed that high TFAP2A expression and low CDK1 expression could be more resistant to drug resistance in LUAD (Figure 8(d)). We also analyzed the correlation between prognostic signature genes expression and immune cell infiltration in LUAD. As a result, the TFAP2A expression showed positive correlation with the level of CD4 + T cells and neutrophils (Figure 9(a)). The KL expression increased as the abundance of B cells, CD4+ T cells, CD8+ T cells, macrophage, and dendritic cells increased   Figure 9(b)). Moreover, the FOXM1 expression was significantly correlated with B cell infiltration and neutrophils (Figure 9(c)). The CDK1 expression decreased as the abundance of B cells and CD4+ T cells increased (Figure 9(d)).

Discussion
As one of a vital risk factor for malignancies, aging exerts a vital role in human morbidity and mortality [17]. Agingrelated genes were suggested as prognostic biomarker for types of cancer [18]. The significance role of ARGs in LUAD had not been fully clarified. Increasing evidences revealed that molecularly defined subtypes could provide novel strategies for the therapy and prognosis of lung cancer [4]. Thus, it is significant to develop effective prognostic signature and molecular subtype for lung cancer.
After obtained 27 differentially expressed aging-related genes (ARGs), we performed consensus clustering analysis, and three clusters of TCGA-LUAD patients with significant difference in prognosis, immune infiltration, chemotherapy ,and targeted therapy were identified. The result revealed that LUAD patients in cluster 3 had a poor overall survival. Moreover, further analysis suggested that LUAD patients in cluster 3 could be more resistant to commonly chemotherapy, targeted therapy, and immunotherapy. Increasing evidences revealed that molecular subtype classification of cancer with distinct biological characteristics could guide the development of precision treatment [19]. Ideal molecular subtype of LUAD was vital for the immune checkpoint blockade therapy and prognosis [20]. Our study identified three aging-related molecular subtypes of LUAD, providing more evidence for the precise treatment and prognosis improvement for LUAD.
Our study also identified 17 potential prognostic biomarkers for LUAD, including AGTR1, BUB1B, CAT,  CDK1, CDKN2A, CDKN2B, EGR1, FOXM1, IL6, KL,  LEPR, PPARG, PYCR1, RECQL4, SOCS2, TFAP2A, and  UCHL1. Based on these prognostic biomarkers, we developed an aging-related prognostic signature using LASSO cox regression analysis. Interestingly, this prognostic signature had a better performance in predicting the 1-year, 3year, and 5-year overall survival of LUAD. Previous studies had identified some prognostic signatures for LUAD. Lin et al. had developed pyroptosis-related prognostic signature in lung adenocarcinoma [21]. Another study also identified an immune-related signature that had a good performance in the prognosis of lung adenocarcinoma [22]. A robust ferroptosis-related signature could provide potential for the personalized outcome prediction for LUAD [23].
Actually, many aging-related signatures had been identified in cancers. In melanoma, a series of aging-related genes were associated with prognosis and responsiveness to immunotherapy [24]. Xue et al. also developed an aging-related prognostic signature for pancreatic adenocarcinoma [25]. Another study also constructed a prognostic signature based on 9-aging related genes for acute myeloid leukemia [26]. Moreover, another bioinformatics study also developed an aging-related signature, which had a good performance in risk stratification and prognosis prediction in lung squamous carcinoma [27].
Another vital finding of our study was that we also identified the lncRNA UCA1/miR-143-3p/CDK1 regulatory axis in LUAD. This regulatory axis may play a vital role of progression of LUAD. Previous study revealed that lncRNA UCA1 was a prognostic biomarker, and it could accelerate tumor proliferation and migration [28,29]. Moreover, previous studies suggested that the miR-143-3p expression was downregulated in lung cancer and correlated with biological process regulation [30,31]. CDK1 was upregulated in lung cancer, and it acts as a potential prognostic biomarker [32,33]. Interestingly, the data of our study further verified these results. Further in vivo and in vitro studies should be performed to verify this regulatory axis.

Conclusion
Our study identified three clusters of TCGA-LUAD patients with significant difference in prognosis, immune infiltration, chemotherapy, and targeted therapy. We also developed an aging-related prognostic signature that had a good performance in the prognosis of LUAD.

Data Availability
The analyzed data sets generated during the study are available from the corresponding author on reasonable request.