Pyroptosis-Related Gene Model Predicts Prognosis and Immune Microenvironment for Non-Small-Cell Lung Cancer

Non-small-cell lung cancer (NSCLC) has a high incidence and mortality worldwide. Moreover, it needs more accurate means for predicting prognosis and treatments. Pyroptosis is a novel form of cell death about inflammation which was highly related to the occurrence and development of tumors. Despite having some studies about pyroptosis-related genes (PRGs) and cancer, the correlation has not been explored enough between PRGs and immune in NSCLC. In this study, we constructed a PRG model by WGCNA to access the prognosis value PRGs have. The testing cohort (n = 464) with four datasets from the GEO database conducted a survival analysis to confirm the stability of the prognostic model. The risk score and age are examined as independent prognostic factors. Based on the PRGs, we found multiple pathways enriched in immune in NSCLC. Separating samples into three subtypes by consensus cluster analysis, Cluster 3 was identified as immune-inflamed phenotype with an optimistic prognostic outcome. A three-gene PRG signature (BNIP3, CASP9, and CAPN1) was identified, and BNIP3 was identified as the core gene. Knockdown of BNIP3 significantly inhibited the growth of H358 cells and induced pyroptosis. In conclusion, the model construction based on PRGs provides novel insights into the prediction of NSCLC prognosis, and BNIP3 can serve as a diagnostic biomarker for NSCLC.


Introduction
Lung cancer is a malignant disease with high mortality and is still the main reason for cancer death worldwide [1]. And non-small-cell lung cancer (NSCLC) accounts for approximately 85% of gross lung cancer [2] which is the most common and serious subtype of malignant tumor in lung tissue. In histology, NSCLC mainly includes lung adenocarcinoma (65%) and lung squamous cell carcinoma (30%) [2]. Despite there being clinical advances that have been gained in molecular diagnosis, chemotherapy, and biologically targeted therapy being beneficial for overall survival in 5 years to NSCLC patients, the prognosis is still gloomy [3][4][5]. Standard lobectomy and sublobar resection are the common treatments for non-small-cell lung cancer patients in the early stage with about 60% 5-year survival rate [6], whereas the survival rate in 5 years of the NSCLC patients with clinical tumor-node-metastasis (TNM) stage IIIB or IV are only 7% and 2%, respectively [7]. Due to the heterogeneity of tumors and the complicated etiology mechanism of cancer, limited survival advances eagerly suggested more biologically prognostic targets have to be applied to risk stratification and treatment optimization for earlystage patients on clinic. Many studies have proved that mRNA can be a kind of signature to predict the prognosis of cancer precisely [8][9][10]. Through microarray gene expression profile analyses and screening, establishing prognostic gene signature may remedy the patient stratification of the present staging system and provide more personalized treatment.
Pyroptosis belonging to programmed cell death is related to inflammation [11], which is mediated by gasdermin (GSDM) under different stimuli [12] that plasma membrane would rupture rapidly and releases proinflammatory content from cells. The typical way to induce pyroptosis is to activate the inflammasomes [13] and mainly executed by the gasdermin family [14]. In addition, gasdermin E (GSDME) cleaved by caspase-3 can also induce pyroptosis [15,16]. Generally, inflammatory vesicles, gasdermin proteins, and inflammatory cytokines are the key elements of pyroptosis that are related to the occurrence, development, invasion, and metastasis of tumors [17]. Nevertheless, the correlation between pyroptosis and cancer prognosis is not specific because of the dual effects of pyroptosis in tumor development. On the one hand, pyroptosis can suppress tumorigenesis and development through releasing inflammatory factors; on the other hand, pyroptosis creates a microenvironment for tumors with nutrition and accelerates the growth of tumors in various cancers [18][19][20][21]. In recent years, the dual effects of pyroptosis in cancer get popular and attract people to the study of the immune system [21]. And some studies indicated that pyroptosis would relate to the regulation of the tumor immune microenvironment [22].
The immune system plays a vital role in the NSCLC progression, and its components have been proved to be the major determinants of tumorigenesis and progression [23,24]. The immune tumor microenvironment including various immune cells and stromal cells is regarded as a mark and a necessary part of NSCLC. And all the cells are related to immune-related genes (IRGs) and some immunomodulators that influence the prediction of cancer prognosis significantly [25][26][27]. Some immunotherapies have been applied to fight against tumors through the immune system [28,29], but the valid part depends on the heterogeneity of the tumor microenvironment [30,31]. The characteristic of NSCLC is a few mutations in the immune system which may be beneficial for the patients from immunotherapy [32]. Besides, it infers that the immune can stimulate the mechanism to work because of the better prognosis with the presence of inflammatory cells in NSCLC and other solid tumors [33,34]. In the meanwhile, the stromal constituent in pulmonary tissue might play a role as a barrier in tumorigenesis by limiting the proliferation of tumor cells [35].
The classic immune subtype includes immune-inflamed, immune-excluded, and immune-desert phenotypes, which is related to response to anti-PD-1 and anti-PD-L1 antibodies [36,37]. NSCLC mainly belongs to the immune-inflamed and immune-excluded phenotypes with a high tumor mutation burden (TMB) [38]. Besides, there is also another kind of LUAD molecular phenotype that displayed a difference in the tumor immune landscape, which contains the terminal respiratory unit, proximal proliferative, and proximal inflammatory subtypes [39], whereas the impacts of molecular subtypes on non-small-cell lung cancer and the clinical outcomes remain unanswered.
In this research, we acquired the mRNA expression profile of lung adenocarcinoma and lung squamous cell carcinoma with clinical information from the TCGA database. Then, we constructed a nomogram model with three prognostic pyroptosis-related genes and tested it in the validation cohort from the GEO database. Besides, we explored the correlation between risk scores and immune subtypes acquired from consensus cluster analysis.

Data Acquisition and Preprocessing.
The RNA-seq matrix (count and FPKM value) in the TCGA-LUAD and TCGA-LUSC projects and corresponding clinical data was obtained from the Cancer Genome Atlas (TCGA) as the training cohort. And the validation dataset including GSE102287, GSE29013, GSE37745, and GSE50081 with its patient data was gained from the Gene Expression Omnibus (GEO) database. Additionally, the 146 pyroptosis-related genes and 2013 immune-related genes are from the Gene-Cards website and ImmPort portal, respectively. To ensure the consistency of expression level in each dataset, we removed the batch effect for all datasets by using the "sva" R package.
2.2. Weighted Gene Coexpression Network Analysis (WGCNA). As a bioinformatics method, WGCNA can construct a gene coexpression network by utilizing the "WGCNA" package [40] in R software (version 4.0.5). Two datasets from the TCGA database were merged and converted into the form of the TPM value. Constructing a sample tree to exclude the outliers in the samples of the training cohort is beneficial for data stabilization. We chose the power of 4 as the soft threshold when the scale-free R 2 is 0.9. And then, the adjacency matrix was defined by converting the expression profile based on soft-threshold. According to average linkage hierarchical clustering, TOM acquired from the adjacency matrix classified the modules with co-expression mRNAs. Next, we computed the module eigengene (ME) dissimilarity based on the ME expression profile after the Pearson correlation analysis. To merge the similar modules, we constructed ME tree after reclustering mRNAs in all modules while defining 0.6 as the height of the tree of merged modules.
In addition, we still need to screen out the module which highly associated with patient overall survival in these joined modules. Two analyses were conducted that one surveyed the correlation between the completed expression profile and ME expression level, and another explored the Pearson correlation between survival time which represented the clinical trait and mRNA expression profile (TPM value). Then, we explored the relationship between the results from two analyses in the module we selected to validate the significance of the module. To get the preliminary genes related to pyroptosis, we intersected the module genes and pyroptosisrelated genes and then painted PPI with the intersecting genes by using Cytoscape software (version 3.8.2).

Screening of Prognostic Genes and Calculating Risk
Value. The univariate Cox regression analysis was applied to access the prognostic significance of the pyroptosisrelated genes and preserve them whose p value is lower than 0.05 by using the "survival" R package. Using LASSO analysis to calculate the coefficient of significant genes in each sample based on the "glmnet" R package. And crossvalidation could prevent overfitting from happening to LASSO analysis. The computing formula of the LASSO approach is score = sum ðeach gene ' s expression level × λÞ, 2 Oxidative Medicine and Cellular Longevity and the prognostic genes signature were established based on it. According to the median risk score, the samples from the TCGA database were divided into high-and low-risk groups, also for the samples in the testing cohort. Afterward, the Kaplan-Meier analysis was executed to compare the OS in different risk groups in two datasets by using "survival" and "survminer" R packages. And "timeROC" R package painted ROC curve to predict the prognosis in 1, 3, and 5 years in the training dataset and 3, 6, and 9 years in the validation cohort.

Prognostic Module Construction and Enrichment
Analysis. The multivariate Cox regression analysis about clinical characteristics is aimed at finding out the independent prognostic factors judging by p value (p value < 0.05 means significant). And we can obtain the range of hazard ratio, 95% confidence intervals (CIs), and p value of every clinical parameter via the "survival" R package. Then, the nomogram was constructed with the independent prognostic factors and visualized by using the "regplot" R package. The proportional hazard assumption was executed using Schoenfeld residual. We first divided the data into three layers based on the survival time of patients. Then, the Schoenfeld residual was calculated by "survival" R package. The calibration curves of 1, 5, and 8 years were validation and prediction of overall survival which were painted after multivariate Cox analyses and random sampling. And the samples fell into three groups with 230 samples randomly in each group. Gene set enrichment analysis (GSEA) was plotted by "ReactomePA" [41] and "clusterProfiler" [42] packages in R software. And we applied GSEA based on the genes in the "MEdarkgreen" module, pyroptosis, the intersection of the module, and pyroptosis-related and testing cohort.

Identification of Immune Subtype.
To identify the immune subtype of the training cohort, the study used consensus clustering analysis and a series of related analyses. Applying the hierarchical cluster algorithm with 50 iterations to estimate the cluster robustness by the "Consensu-sClusterPlus" R package [43]. The cumulative distribution function (CDF) curve calculated the increasing proportion area under it for determining the best cluster number. "survival" and "survminer" R packages for survival analysis were used to check the overall survival of clusters 1 and 2 with risk subgroup, respectively. In addition, the study ordered the significantly enriched pathways of each immune subgroup through KEGG functional analysis, and the R packages of "clusterProfiler," "org.Hs.eg.db," and "enrichplot" involved this analysis and plotting.
2.6. Immune Infiltration. ESTIMATE algorithm was conducted on R software to calculate the immune score, stromal score, and ESTIMATE score of overall patients with immune subgroups based on the counts value of the training dataset. In the meanwhile, identifying the cell type of each sample in the training cohort (TPM value) by CIBERSORT analysis is aimed at assessing the proportion of 21 immune cells in every patient according to the abundance of immune and stromal cells. And just kept the samples that the p value is under 0.05. Besides, single-sample GSEA (ssGSEA) was conducted to assess and compare the immune cell scores and abundance of immune functions in two risk groups, respectively. TIMER2.0 website was an online public database of immune, and we downloaded the plot pictures of the correlation between the expression level of prognostic genes and six immune cells. The immunohistochemistry images of three prognostic genes were obtained from the HPA      2.7. The Screening of BNIP3. Rank forest (RF) algorithm was conducted in R software (version 4.0.5) by "randomForest" R package. And we deleted the genes whose importance score was less than 2. Separating samples from the TCGA database by "stringr" R package into the normal and tumor groups, we calculated the importance (mean decrease Gini) of each gene which was obtained after univariate Cox regression analysis. The genes were ranked in descending order by importance. The PPI network for searching hub genes was plotted by the "cytoHubba" application in Cytoscape software (version 3.8.2). The nodes with high score "cytoHubba" application calculated were red, and the nodes with low score were yellow. The ROC curves were obtained and visualized by "pROC" and "ggplot2" R packages in R software (version 4.0.5).
2.8. Cell Culture and Transfection. A lung cancer cell H838 was purchased from American Type Culture Collection (ATCC) and maintained in 1640 medium containing 10% fetal bovine serum (Gibco, Gland Island, USA) and 1% penicillin-streptomycin (Gibco, Gland Island, USA) and cultured at 37°C with 5% CO 2 . BNIP3-siRNA from Sangon Biotech (Shanghai, China) was used to silence the expression of BNIP3. In this study, the sequence of BNIP3-siRNA is sense: 5 ′ -GAUUACUUCUGAGCUUGCATT -3 ′ . H358 cells were seeded in 12-well plates at a density of 5 × 10 4 cells/well and transfected 24 hours later. BNIP3-siRNA or NC-siRNA was transfected to a final concentration of 20 nM using siRNA transfection reagent (Polyplus, France). Finally, the transfected cells were detected after 48 hours.
2.9. Cell Apoptosis Assay. H358 cells were treated with transfection reagents BNIP3-siRNA and NC-siRNA for 48 hours. After 48 hours, cells were collected and then incubated for 30 min at room temperature in the dark using the FITClabeled membrane-linked protein-V and PI (BD Biosciences) in the apoptosis kit. Eventually, apoptosis was detected using flow cytometry.
2.10. Western Blot. Proteins were extracted from cells and cell lysates were prepared using RIPA lysate with PMSF (Solarbio, Beijing, China), and then, protein quantification was performed using a BCA protein assay kit (Sangon Biotech, Shanghai, China). Proteins were then separated by The PPI network of pyroptosis-related genes in the "MEdarkgreen" module. The color and size of genes represent the degree value (database annotated). The big blue dots represent genes with a high degree, the small yellow dots are the genes with a low degree, and pink dots are the genes without a degree. The grey lines links dots represent the combined score and the thicker with higher relevance. 6 Oxidative Medicine and Cellular Longevity  3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 2 2 Log (λ)  The next day, after washing the membrane three times with TBST, the membrane was incubated for 1 h at room temperature with horseradish peroxidase-labelled secondary antibody (1 : 4000), followed by three washes with TBST. Finally, the 2.11. Immunohistochemistry (IHC). Tumor tissues were fixed with 4% paraformaldehyde, embedded in paraffin, and prepared into 5 μm thick sections. The sections were subsequently dewaxed with xylene and dehydrated in a gradient concentration of alcohol solution. For immunohistochemical experiments, antigen retrieval was performed with 0.01 M sodium citrate (pH: 6.0), and endogenous peroxidase was blocked by adding 0.3% hydrogen peroxide (H 2 O 2 ) and incubating in 10% goat serum albumin for 30 min. Sections were then incubated with primary antibody (BNIP3) at 4°C overnight. The next day, after incubation with HRPconjugated anti-rabbit secondary antibody for 1 h, samples were incubated with 3,3 ′ -diaminobenzidine (DAB), sections were counterstained with Mayer hematoxylin, dehydrated, cleared with xylene, and finally blocked with neutral resin and then observed on a multifunctional microtome (BIOTEK).
2.12. Statistical Analysis. All bioinformatics statistical analyses in the study were conducted using R software (version 4.0.5). The survival analysis was performed with a Kaplan-Meier assay and a log-rank test. To test the nomogram, we conducted proportional hazard assumption by calculating  the Schoenfeld residual. The Mann-Whitney test was used for the validation of ssGSEA comparing. All experiments were repeated at least three times and the data were expressed as mean ± SEM. ANOVA or t-test was used to determine the statistical significance of the control and experimental groups. In the investigation, data were considered statistically significant when p < 0:05.

Construction of Weighted Coexpression Network and
Identification of Trait-Module. The flow chart of the overall design in this study was shown in Figure 1. In this study, mRNA expression profiles with 1145 cases and 1026 pieces of patient clinical information from the TCGA-LUAD and TCGA-LUSC projects were assembled as the training cohort, and 4 datasets from the GEO database containing 464 samples were selected as our validation cohort after removing batch effect. Conducting WGCNA to construct a weighted coexpression network, the genes which have similar expression tendencies would be divided into one module. Every module was named with corresponding colors. When the scale-free topology fit index R 2 = 0:9, the power of softthresholding value reached 4. Therefore, we calculated the adjacency matrix and constructed the co-expression network based on the power value of soft-thresholding (Figures 2(a)  and 2(b)). To decrease the part of a subdivision in the network, we computed the module eigengenes (MEs) and the number of modules decreased from 59 to 39 (Figure 2(c)). The relation of modules with clinical traits was presented in the form of p value in the heat map (Figure 2(d)). Due to the positive correlation between module and survival time, we selected the "MEdarkgreen" module as our research object. To certify the reliability of the "MEdarkgreen" module, we analyzed the Pearson correlation between gene significance of the selected module and module membership (Figure 2(e)). Afterward, the genes in the "MEdarkgreen" module were intersected with pyroptosis-related genes and 35 pyroptosis-related genes in the module were obtained. Eventually, the interaction of these genes was analyzed by the STRING website (https://www.string-db.org/) and visualized the protein-protein interaction (PPI) network with Cytoscape software (version 3.8.2) (Figure 2(f)).

Prognostic Gene Identification and Nomogram Model
Establishment. Applying univariate Cox regression analysis to 35 pyroptosis-related genes to find out the prognosisrelated genes (Table S1), there were three genes including CAPN1, BNIP3, and CASP6 significantly related with prognosis (p value < 0.05) after analysis (Figure 3(a)). To further confirm the relation of three genes with overall survival (OS) in NSCLC patients, the LASSO regression model was constructed and cross-validation prevented the overfitting from LASSO analysis (Figures 3(b) and 3(c)). The risk coefficient of each sample was calculated after LASSO regression analysis, and the sample would divide into the high-and low-risk groups according to the median of the risk score (Figure 3(d)). The next step was to explore the correlation between risk score and survival status. As shown in Figure 3(e), the higher risk score is accompanied by more dead patients. Besides, we also checked the expression level of three prognostic genes in two risk groups, and only CAPN1 was expressed obviously in the high-risk group (Figure 3(f)). The detailed survival condition in two risk groups was presented in Figure 4(a), in which we can observe the relationship among survival time, survival status, and risk groups through survival analysis. The two groups had a significant difference in survival condition (p value < 0.05), and patients in the lowrisk group got a better prognosis. The ROC curves with 1, 3, and 5 years were painted to validate the accuracy of predicting prognosis, and the more area under the curve (AUC) means the higher reliability for prediction (Figure 4(b)). For improving the accuracy of the aforementioned analyses about survival conditions, we used the testing cohort which had merged four datasets from the GEO database to validate. The results were consistent with the training cohort except ROC was poor on reliability (Figures 3(g)-3(i), 4(c), and 4(d); Table S2). Afterwards, we compared our prognostic model with some published models by painting ROC curve (Figure 4(e)). The AUC value of our prognostic model was the highest in all of the compared models in 3 years (AUC = 0:721).
To construct the nomogram model, we firstly processed independence prognosis analysis by using the method of multivariate Cox regression analysis to screen the clinical parameters, age and risk score were the two clinical characteristics whose p value was under 0.05 (Table 1). Then, the nomogram model was built based on two independent prognostic factors to predict the survival rate in 1, 5, and 8 years ( Figure 5(a)). The proportional hazard assumption was conducted to test the nomogram by calculating the Schoenfeld residual ( Figure 5(b)). The p values of the two independent prognostic parameters are higher than 0.05, which means the proportional hazard assumption is effective and the establishment of nomogram is a success. The calibration curves of 1, 5, and 8 years presented the correlation between the rate of nomogram-predicted OS and observed OS and

Gene Set Enrichment Analysis.
Processing gene set enrichment analysis to four datasets mentioned in the materials and methods part to search the pathways and screen by calculating the p value (p value < 0.05). As observed, the pathways of the "Immune system" and "Innate immune system" appeared multiple times in three datasets (Figures 6(a)-6(c)). We found that "Immunoregulatory interactions between a lymphoid and a nonlymphoid cell" and "PD-1 signaling" pathways are significantly enriched in the validation cohort ( Figure 6(d)). Therefore, we deduced that pyroptosis may have a correlation with immune and explored more about immune by analyzing the pyroptosis-related expression profile.

Survival Analysis and KEGG Pathway Enrichment
Analysis with Immune Subgroups. Firstly, the list of the immune-related genes was obtained from the ImmPort website and intersected with the expression profile of the training cohort. And consensus clustering analysis identified the immune-subtype based on the immune-related expression matrix. As shown in Figure 7(a), the area under CDF curves no longer increased rapidly after k = 3. And the same result can be deduced by analyzing the relative change in area under the CDF curve (Figure 7(b)). Based on the matrix of clustering heat map, we decided to divide the patients into three subgroups (Figure 7(c)). To search the subgroups correlation with overall survival, we conducted survival analysis to the complete expression profile grouped by consensus analysis and the difference between the three subgroups was significant (p value = 0.032) (Figure 7(d)). For deep exploring the influence of risk score in clustering subgroups, we applied survival analysis to the three subgroups, respectively (Figures 7(e) and 7(f)). As observed in the figures, the prognosis in the low-risk group was better than in the high-risk group in cluster 1 obviously (p = 0:0053), but insignificantly in cluster 2 (p value = 0.3). Besides, KEGG pathway enrichment analysis was used for the three subgroups, and some enriched pathways were left (p value < 0.05) (Figures 7(g)-7(i)). In cluster 1, the pathway is mainly enriched in "Cytokine-cytokine receptor interaction," "Viral protein interaction with cytokine and cytokine receptor," "Chemokine signaling pathway," and so on, also in cluster 2 but a little different in cluster 3. The pathways including "Renin secretion," "Cytokine-cytokine receptor interaction," and "Renin-angiotensin system" were enriched in cluster 3.

Immune Infiltration.
Using ESTIMATE algorithm to calculate the scores of immune, stromal, and estimate and visualized on heat map image with immune cluster (Figure 8(a)). And both the immune score and stromal score in cluster 2 were the lowest, and it explained that both the immune and stromal cells are low-content in the samples of cluster 2. Thus, we deduced that cluster 2 was regarded as the immune-desert phenotype. The immune score was assessed highly in clusters 1 and 3. Nevertheless, the estimate score in cluster 1 was low which means that the immune and stromal scores are not enriched. The estimate score in cluster 3 was more stable and higher than in cluster 1. Therefore, we supposed that cluster 3 is the immune-inflamed phenotype. Afterward, the CIBERSORT analysis was conducted to check the proportion of immune cells in 21 types in each case and screen the samples whose p value was higher than 0.05 (Figure 8(d)). To deeply discuss the relation of immune status with risk scores, the ssGSEA analysis was performed on training (Figures 8(b) and 8(c)) and testing cohort (Figures 8(e) and 8(f)). As observed in the training cohort, the enrichment scores of macrophages and Treg were significantly different in the high-and low-risk groups (p value < 0.01) and the score of immune function including CCR and Para inflammation had an obvious difference (p value < 0.001). The enrichment scores of pDCs, Treg,  14 Oxidative Medicine and Cellular Longevity cytolytic activity, and inflammation-promoting in the validation dataset were significantly different (p value < 0.05). In addition, we downloaded some images about immune cells and immunohistochemistry from the TIMER2.0 website and the HPA database (The Human Protein Atlas), respectively. As shown in Figure S1, we can observe the correlation between six immune cells and three prognostic genes in lung adenocarcinoma and lung squamous carcinoma, but most of them had a slightly negative correlation. Figure S2 showed the images of immunohistochemistry of three prognostic genes in normal tissues, lung adenocarcinoma tissues, and lung squamous carcinoma tissues, respectively. And there were obvious differences between normal tissues and tumor tissues. Eventually, we explored the relevance between PRGs and immune checkpoints. The condition of the expression level of 35 PRGs (before univariate Cox analysis) in immune subtypes was presented in Figure S3A. The BNIP3 was expressed significantly in clusters 1 and 3 but had a low expression in cluster 2. The CAPN1 was expressed highly in cluster 1 but expressed low in cluster 3. And the CASP6 was expressed highly in cluster 3. The expression level of TIGIT and LAG3 in the immune checkpoints in the two risk     groups is different significant, and both of them were expressed higher in the high-risk group ( Figure S3F). Therefore, we checked the Pearson correlation coefficients between the two immune checkpoints and prognostic PRGs (Figure S3B-E, G, H). The BNIP3 had an obvious negative correlation with the two immune checkpoints, respectively. The CAPN1 only negatively correlated with TIGIT, whereas the CASP6 did not present any relation with the two immune checkpoints.
3.6. BNIP3 Is Highly Expressed in Lung Adenocarcinoma, and Knockdown of BNIP3 Induces Lung Cancer Cell Pyroptosis. Eventually, we screened the genes further based on the prognostic model. The three prognostic signatures' expression levels were up-regulated in the tumor group ( Figure S4A). The random forest (RF) algorithm was conducted to calculate the importance of each gene after the univariate Cox regression analysis ( Figure S4C). The importance of genes means the contribution degree of genes to the prognostic model. And BNIP3 contributes most in the three prognostic genes (mean decrease Gini = 9:78), CAPN1 contributes least in the three prognostic genes (mean decrease Gini = 2:29). Subsequently, the PPI network was constructed to search the hub genes ( Figure S4B). However, only BNIP3 and CASP6 were selected in the prognostic genes to build the hub genes network. Hence, we removed CAPN1 and continued to explore the details of BNIP3 and CASP6. Plotting ROC curves based on the expression level of BNIP3 and CASP6, respectively ( Figure S4D, E), the AUC value of BNIP3 (AUC = 0:853) was higher than the value of CASP6 (AUC = 0:848). In summary, BNIP3 became our final target signature to explore.
To further explore the role of BNIP3 in lung cancer cells, five pairs of human lung cancer tissues and their paraneoplastic tissues were incubated with BNIP3 as primary antibody, and the results of immunohistochemistry showed that BNIP3 was highly expressed in lung adenocarcinoma compared with normal tissues (Figure 9(a)). We also took 4 pairs of human lung cancer tissues and their paraneoplastic tissues and extracted the proteins for western blot experiments, and the results showed that BNIP3 were upregulated in lung cancer tissue proteins (Figure 9(b)). After treating H358 cells with transfection reagent for 48 h, we examined the cell death using an apoptosis kit in order to detect the effect of knocking down BNIP3 on H358 cells, and the results showed that the cell death rate was significantly increased in the knockdown BNIP3 group (Figure 9(c)). Knocking down BNIP3 with transfection reagent, then we extracted proteins and detected GSDMD and caspase-8 by western blot, and the results showed that knocking down BNIP3 increased GSDMD and caspase-8, indicating that knocking down BNIP3 could induce pyroptosis in lung cancer cells H358 (Figure 9(d)).

Discussion
Pyroptosis induction can thoroughly remove neoplastic cells in multiple cancers [44]. However, due to the activation of pyroptosis, some inflammatory mediators will be released which might promote the occurrence and progression of cancer [45,46]. The respiratory system is sensitive to pyroptosis induction [47]. The piperlongumine analogue L50377 induced pyroptosis by stimulating reactive oxygen species (ROS) to mediate the suppression of NF-κB in NSCLC [48]. Additionally, the downregulation of lncRNA-XIST would activate pyroptosis mediated by the miR-335/SOD2/ ROS signal pathway to suppress the development of NSCLC [49]. Many studies suggested that ROS is related to pyroptosis [50,51]. Nevertheless, the mechanism of how pyroptosis influences NSCLC and what kind of regulations the predictors do via the pathways are not clear until now. Therefore, we constructed a prognostic model based on three pyroptosis-related genes by using bioinformatics methods which were validated in the testing cohort to prove its availability and benefit to early diagnosing for NSCLC patients.
In this investigation, we determined three pyroptosisrelated genes including CAPN1, BNIP3, and CASP6 as prognostic signatures of NSCLC based on the expression profile with 146 pyroptosis-related genes from the TCGA database, which are overexpressed in the tumor tissues. Calpain 1 (CAPN1) is a type of cysteine activated by calcium with proinflammation [52], which is widely expressed in vivo and has been proved as a promoter of cancer progression that is significantly related to poor prognosis [53][54][55]. The Calpain family where CAPN1 belongs can influence the malignancy phenotype of lung cancer cells by degrading proteins according to some reports [56][57][58][59] and also is involved in various cellular processes containing cell signal transduction and apoptosis, etc. [52]. And several studies have verified the importance of the Calpain family in tumor migration and invasion [60]. Meanwhile, CAPN1 may be a biomarker of tumor or a potential target used to diagnose and treat lung adenocarcinoma [61]. The CAPN1 rs17583C>T was related to a better prognosis, which provided certification of the functional relationship between genetic mutation and the better clinical outcomes [62]. BNIP3, a member of the Bcl-2 protein family with mitochondrial BH3 [63], affects different ways of cell death in hypoxic conditions [64] and also various metabolic pathways [65,66]. BNIP3 is regarded as a proapoptotic protein, whose dysregulated expression is related to mitophagy, autophagy, and pyroptosis [67][68][69]. Tumor cells are sensitive to cisplatin and gemcitabine when BNIP3 upregulates [70]. Interestingly, the expression level of BNIP3 will elevate in early-stage adenocarcinoma but decrease in metastasis progression. And the deletion of BNIP3 can increase angiogenesis which promotes tumorigenesis and the metastasis of breast cancer [71]. Also, BNIP3 was inferred as an independent prognostic factor that related to autophagy in earlystage NSCLC but not clear in advanced lung cancer [63].

22
Oxidative Medicine and Cellular Longevity CASP6 was suggested that may be related to apoptosis [72] which attends to the occurrence and development of cancer by promoting the activation of the ways of programmed cell death in tumor issues [73]. And it has been detected in some reports about pyroptosis as a prognostic biomarker in cancer [74,75]. Besides, CASP6 can mediate the activation of the innate immune system and inflammasomes [76]; also, its mutation associated with tumors can decrease the overall catalytic turnover [77].
The result of GSEA based on the pyroptosis-related genes suggested that the immune may play a relevant role in pyroptosis and non-small-cell lung cancer and the pathways are mainly enriched in the immune system and innate immune system. The inflammation induced by pyroptosis can activate antitumor immunity and synergy with checkpoint blockade [78]. The inflammasome is the key mediator of lung immunity that would be regulated to release caspase-1 [79]. The activation of caspase-1 will trigger the pyroptosis [80]. Caspase-1-dependent process and secrete IL-1β and IL-18 by means of inflammatory caspases activation and the innate immune responses induced by macrophages [81,82]. Besides, many studies identified that CD8+ T cells and NK cells can suppress tumors through the induction of pyroptosis of tumor cells in the immune microenvironment, and the tumor cells under pyroptosis would recruit immune cells for suppressing tumors, too [18,78]. However, inducing pyroptosis cannot be beneficial for all immunotherapy modalities and sometimes would be required to work with ICIs to kill the cold tumor cells efficiently [78].
The immune molecular phenotype may help identify the patients who would benefit from immunotherapy [83]. In this research, we identified three immune subtypes by The expression of normal lung tissue protein and lung cancer tissue protein expression were detected by western blotting assay. Densitometry of the ration of BNIP3 was shown as bar chart. All data were representative of at least three independent experiments and presented as mean ± SD, * * p < 0:01 compared with the control. 723, 719, 683, and 733 are patient numbers. (c) Using flow cytometry to detect cell death of H358 after knockdown BNIP3. (d) Pyroptosis-related proteins were detected by western blotting assay. Densitometry of the ration of protein was shown as bar chart. All data were representative of at least three independent experiments and presented as mean ± SD, * p < 0:05, * * p < 0:01 compared with the control. 23 Oxidative Medicine and Cellular Longevity consensus cluster analysis. Cluster 2 was supposed to be the immune-desert phenotype because of the low immune and stromal scores, and cluster 3 was the immune-inflamed phenotype with the high immune score. BNIP3 had a low expression in the high-risk group and cluster 2. Meanwhile, the high-risk group and cluster 2 had poor prognoses compared with the low-risk group and cluster 3, respectively. Hence, we supposed that the low expression level of BNIP3 suggested the immune-desert phenotype and the high level of immune cell infiltration. And the high-risk group had a poor prognosis because of it.
In conclusion, we constructed a prognostic nomogram model with three pyroptosis-related genes and validated it in another dataset and also explored the correlation between pyroptosis-related and immune microenvironment based on immune subgroups at the same time. A three-gene PRG signature (BNIP3, CASP9, and CAPN1) was identified, and BNIP3 was identified as the core gene. Knockdown of BNIP3 significantly induced pyroptosis. In conclusion, the model construction based on PRGs provides novel insights into the prediction of NSCLC prognosis, and BNIP3 can serve as a diagnostic biomarker for NSCLC. However, there are still some limitations in this research. The patient data was all acquired from the public database so that we did not use more data to strengthen the reliability of the prognostic model. The lack of experiments in vivo also makes our study results unable to be further verified. Therefore, we need forward-looking studies to improve the availability of this prognostic model.

Data Availability
The data used to support the findings of this study are included within the article.