Abnormal Expression and Prognostic Significance of Bone Morphogenetic Proteins and Their Receptors in Lung Adenocarcinoma

Background Lung adenocarcinoma (LUAD) is one of the most life-threatening malignancies. The crucial role of bone morphogenetic protein (BMP)/BMP receptors reveals the significance of exploring BMP protein-related prognostic predictors in LUAD. Methods The mRNA expression of BMPs/BMP receptors was investigated in LUAD and normal lung tissues. Gene Ontology and the Kyoto Encyclopedia of Genes and Genomes pathway analyses were performed, and the prognostic values were assessed by Kaplan-Meier Plotter. Univariate and multivariate Cox regression analyses were executed to ascertain the correlation between overall survival (OS) and the mRNA expression of BMPs/BMP receptors. The receiver operating characteristic (ROC) curves were implemented to evaluate the predictive power of the prognostic model. Then, the prognostic model was validated in the GEO cohort. Furthermore, a nomogram comprising the prognostic model was established. Results The mRNA expression of BMP2/5/6/R2, ACVRL1, and TGFBR2/3 was lower in LUAD tissues than in normal lung tissues. High expression of BMP2/4/5/R1A/R2, ACVR1/2A/L1, and TGFBR1/3 was associated with better OS, while BMP7 and ACVR1C/2B were associated with poorer OS. Three genes (BMP5, BMP7, and ACVR2A) were screened by univariate and multivariate Cox regression analyses to develop the prognostic model in TCGA. Significantly better survival was observed in LUAD patients with a low-risk score than those with a high-risk score. The ROC curves confirmed the good performance of the prognostic model, then, the prognostic model was validated in the GSE31210 dataset. A nomogram was constructed (AUCs>0.7). And hub genes were further evaluated, including gene set enrichment analysis and immune cell infiltration. Conclusions BMP5, BMP7, and ACVR2A are potential therapeutic targets in LUAD. The three-gene prognostic model and the nomogram are reliable tools for predicting the OS of LUAD patients.


Introduction
Lung cancer is the main cause of cancer-related deaths [1]. Non-small cell lung cancer (NSCLC) is the most common type of lung cancer, of which, adenocarcinoma is the main subtype [2,3]. NSCLC is one of the most successful applications of precision medicine [4]; no doubt investigating new molecular biomarkers or therapeutic targets has become a research trend [5]. Numerous novel biomarkers for cancer treatment based on large-scale, genome-wide association research databases have emerged [6], but there is an urgent need to identify novel molecular biomarkers and therapeutic targets for lung adenocarcinoma (LUAD).
Researchers have discovered that BMPs and BMP receptors could affect the prognosis of patients in multiple types of cancer including gastric cancer [7], colorectal cancer [9], and lung cancer [10]. Moreover, the dual role of BMPs in both cancer development and suppression has led BMPs to be regarded as powerful therapeutic targets [11]. However, the underlying functions and mechanisms of BMPs and BMP receptors remain unclear, and a comprehensive mRNA profiling of BMPs and BMP receptors in LUAD has not been performed. The present study involved database research and in-depth bioinformatic analyses to determine the prognostic significance of BMPs and BMP receptors in LUAD.

Materials and Methods
2.1. Data-Mining Analysis by ONCOMINE Database. The online cancer database ONCOMINE (https://www .oncomine.org/) was beneficial to data mining of the transcriptional expression of BMPs/BMP receptors in 20 types of cancer. The student's t-test was used to assess whether there was a significant difference between the transcriptional expression of BMPs/BMP receptors in LUAD samples and those in normal lung samples. The P < 1e − 4 and fold change ðFCÞ > 2 were selected as the cut-off criteria, respectively.

The Relationship between the Abnormal Expression of
BMPs/BMP Receptors and the Characteristics of LUAD. The LUAD dataset containing data from 517 cases was from cBioPortal (http://www.cbioportal.org/), an open-access dataset used to sort out multiple cancer genes. The Mann-Whitney (M-W) test was employed to compare the mRNA expression of BMPs/BMP receptors between two groups of LUAD patients with different clinical features. The Kruskal-Wallis (K-W) test was used for multiple group comparison. If the results of the K-W test were significant, a further Dunn's multiple comparison test was conducted.

Gene Ontology and Pathway Enrichment Analysis. The
Gene Ontology (GO) analysis was performed to investigate the function of BMPs/BMP receptors using the R package "clusterProfiler" [12]. The Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis was also executed.

Kaplan-Meier
Plotter. According to the median gene expression in the online openly available database Kaplan-Meier Plotter (http://www.kmplot.com/), the mRNA transcription level of BMPs/BMP receptors was divided into two groups (high vs. low). Importantly, their prognostic values were investigated, and the hazard ratio (HR) with 95% confidence intervals (CI) and Log-rank P values were displayed. P value < 0.05 was considered as the cut-off criteria. The Kaplan-Meier (K-M) survival curves were exhibited based on the value of the most detected probe for each BMPs/BMP receptor.
2.5. Definition of the Gene-Related Prognostic Model. The mRNA expression profiles and the corresponding clinical information from the LUAD patients were retrieved from The Cancer Genome Atlas (TCGA) database, containing 533 LUAD tissues. After preprocessing (including removal of abnormal values, screening of samples with tumor purity greater than 60%, expression matrix standardization and filtering, and removal of genes with low expression levels), 322 samples were obtained. Univariate and multivariate Cox regression analyses were executed to explore the correlation between overall survival (OS) and the mRNA expression of each BMPs/BMP receptor. In the univariate Cox regression analyses, when the P value < 0.05, the gene was considered significant. Then, multivariate Cox regression analyses were facilitated to evaluate the contribution of a gene as an independent prognostic factor for patient survival. A threegene-based prognosis risk score was built based on a linear combination of the regression coefficient obtained from the multivariate Cox regression model (β) multiplied by the mRNA expression level. The optimal cut-off value used to classify patients into low-risk and high-risk groups was identified using X-tile software. The 322 LUAD patients with survival data were separated into low-risk and high-risk groups based on the optimal cut-off value. The K-M survival curves of low-risk or high-risk cases were exhibited. Timedependent receiver operating characteristic (ROC) curves were used to evaluate the predictive power of the prognostic model. Then, the prognostic model was validated in the Gene Expression Omnibus (GEO) dataset (GSE31210).
2.6. Building and Validating a Predictive Nomogram. Univariate and multivariate Cox regression analyses were implemented, with the clinical traits as independent variables and the OS as the dependent variable. All reported P values were two-sided, and the HR and 95% CI were displayed. In this study, a combined model constructed on all independent predictive factors derived from multivariate Cox regression analyses was used to develop a nomogram to evaluate the probability of 1-, 3-, and 5-year OS in LUAD patients. Subsequently, verifications including discrimination and calibration were carried out. The discriminative power of the nomogram was assessed using the concordance index (Cindex) through a bootstrap method with 1000 resamples. The value of the C-index fluctuated between 0.5 and 1.0, where the closer the value is to 1.0, the more perfect the ability to correctly distinguish the results of the model. The calibration curve of the nomogram was investigated graphically, and overlapping with the reference line indicated that the model was very suitable. Decision curve analysis (DCA) was conducted to determine the clinical utility of a predictive nomogram. At the same time, ROC curve analyses were performed to compare the predictive accuracy of the combined model with those of a single significant prognostic factor. 2.7. Gene Set Enrichment Analysis. Gene set enrichment analysis (GSEA) of a single key gene was conducted using the R

Transcriptional Levels of BMPs/BMP Receptors in LUAD
Patients. The BMPs/BMP receptors transcriptional level was compared between various cancers and para-carcinoma tissues based on the ONCOMINE database. BMPs/BMP receptors were generally downexpressed in most tumors as presented in Figure S1. Most BMPs/BMP receptors were downregulated in lung cancer tissues, except BMP7, ACVR1C, and ACVR2B which were upregulated in one dataset. This might be due to the limited number of samples. The mRNA level of all BMPs/BMP receptors significantly decreased in LUAD tissues (Table 1).

The Association of BMPs/BMP Receptors mRNA
Expression and the Clinical Features of LUAD Patients. The relationships between the mRNA expression of BMPs/BMP receptors and the clinical characteristics of LUAD patients were evaluated showing no significant difference between <65 and ≥65 groups, except for TGFBR2/3, BMPR2, and ACVR2A/2B, with only ACVR2B reduced in the older group (Figure 1(a)). A significant difference in BMP3/6/7, ACVRL1, and TGFBR2/3 expression was observed across gender ( Figure 1(b)). BMP4, BMPR1A/2, ACVR1B, and TGFBR1 exhibited lower expression in White than in non-White individuals (Figure 1(c)). Interestingly, only the        The mRNA expression of BMP7 (P = 0:0260) was significantly increased in the central lung group, while BMPR1B (P = 0:0110) mRNA expression was significantly increased in the peripheral lung group (Figure 1(f)). Upregulated ACVR1C and ACVR2A/2B were also associated with distant metastasis (Figure 1(j)). Lymph node status had a relation-ship with BMPR1B and ACVR2A/2B mRNA expression (Figure 1(i)).

Functional Annotation of the BMPs/BMP Receptors.
The GO analysis showed that changes in the BP were significantly enriched in the transmembrane receptor protein serine/threonine kinase signaling pathway and its related regulation and restricted Smad protein phosphorylation and its related regulation (Figure 2(a)). For the CC, these genes were enriched in the plasma membrane receptor complex, serine/threonine kinase complex, protein kinase complex, transferase complex, and transferring phosphorus-containing groups (Figure 2(a)). Variations in MF were mainly enriched in Smad binding, TGF-β activated receptor binding, and transmembrane receptor protein serine/threonine kinase activity (Figure 2(a)). The KEGG pathway analysis revealed that the TGF-β signaling pathway, cytokine-cytokine receptor interaction, and Hippo signaling pathway were associated with the genes (Figure 2(b)).

The Prognostic Significance of BMPs/BMP Receptors in LUAD Patients.
The prognostic significance of the mRNA expression of BMPs/BMP receptors in LUAD patients was determined through the Kaplan-Meier plotter, revealing that 12 BMPs/BMP receptors were significantly associated with OS. High expression of BMP2, BMP4, BMP5, BMPR1A, BMPR2, ACVR1, ACVR2A, ACVRL1, and TGFBR3 was associated with better OS, while BMP7, ACVR1B, and TGFBR2 were associated with poorer OS (Table 2).

A Good Performance for the Prognostic Model in
Predicting the OS of LUAD Patients. Univariate Cox regression was performed in the TCGA to ascertain the correlation between differentially expressed genes and OS of LUAD patients and identified three genes that were significantly related to the OS of LUAD patients when the P value was < 0.05. Then, a multivariate Cox regression analysis was implemented after preliminary filtering, and finally, three genes were selected to establish a prediction model. The predictive model was characterized by the linear combination of the expression levels of the three genes weighted by their relative coefficient from the multivariate Cox regression as follows: risk score = ð−0:12332 * mRNA expression level of BMP 5Þ + ð0:08118 * mRNA expression level of BMP7Þ + ð− 0:35362 * mRNA expression level of ACVR2AÞ.
In this study, for the 316 patients with survival time, the three-gene expression risk score was calculated, and the Xtile was used to obtain the optimal cut-off value of the risk score ( Figure S2). A total of 83 patients were classified as high-risk patients because their risk scores were higher than the cut-off value, while the other 233 patients were assigned to the low-risk group ( Figure S3). Based on the three genes, the K-M curves of the two groups were significantly different (P < 0:0001; Figure 3(a)). The prognostic capacity of the three-gene signature was estimated by calculating the AUCs of the time-dependent ROC curves (Figure 3(a)), which were 0.70, 0.64, 0.63, and 0.66 for the 1-, 2-, 3-, and 5-year survival times, respectively, indicating that the prognostic model had high sensitivity and specificity (Figure 3(a)).
To assess the predictive value of the prognostic model in predicting the OS of LUAD patients in other datasets, the prognostic model was evaluated in the GSE31210 cohort. A total of 226 patients were categorized into the low-risk group (n = 148) and high-risk group (n = 78) using the optimal risk cut-off value of the GEO cohort (the same as the risk scoring model of the TCGA cohort) ( Figure S2, S3). Consistent with the results in the TCGA, the OS of LUAD patients in the GSE31210 of the high-risk group was significantly higher than that of the low-risk group (P = 0:02; Figure 3(b)). In addition, the time-dependent ROC curves on the survival prediction of the prognostic model demonstrated that the AUC at 1-year was 0.79, at 2-years was 0.74, at 3-years was 0.63, and at 5 years was 0.59, indicating that the prognostic model could predict the OS of LUAD patients (Figure 3(b)).

Construction of a Predictive
Nomogram. The predictive model with the univariate Cox regression analysis, clinical covariates of pathologic stage, tumor size, and lymph node metastasis had some predictive value; however, sex, age, race, EGFR mutation, gross pathology, and distant metastasis did not correlate with OS (Figure 4(a)). Therefore, the pathologic stage, tumor size, lymph node metastasis, and the three-gene prognostic model (risk score) were incorporated into the multivariate Cox regression analysis, showing that they were independent prognostic factors associated with OS (Figure 4(a)).
To establish a clinically applicable method for predicting the survival probability of LUAD patients, a nomogram was developed to predict the probability of the 1-, 3-, and 5year OS in the TCGA cohort. The predictors of the nomogram included four independent prognostic factors (pathologic stage, tumor size, lymph node metastasis, and the three-gene prognostic model; Figure 4(b)). The C-index for the model for evaluation of OS was 0.6696159, with 1000 cycles of bootstrapping, bias-corrected-C-index was 0.6771448. The calibration plots illustrated that the   (Figure 4(c)). DCA evaluates clinical usefulness, showing that the nomogram exhibited a good net benefit (Figure 4(d)). Also, compared with the pathologic stage, tumor size, lymph node metastasis, and the three-gene prognostic model, the AUCs of the nomogram were the largest (AUCs > 0:7) (Figure 4(e)), confirming that the nomogram was a good tool for predicting survival for patients with LUAD, which might be a benefit for patient consultation, decision-making, and follow-up schedule.

Gene Set Enrichment Analysis of BMP5/7 and ACVR2A.
The GSEA analysis showed that the low expression of ACVR2A and BMP7 was enriched in "cell cycle," "purine metabolism," "pyrimidine metabolism," "ribosome," and "spliceosome." The "cytokine-cytokine receptor interaction" was enriched in the high expression levels of BMP5 and the low expression of ACVR2A ( Figure 5). BMP5, BMP7, and ACVR2A were involved in DNA proliferation, purine and pyrimidine nucleoside biosynthesis, purine and pyrimidine metabolism. Moreover, there was a close relationship between cytokines and BMP5/7, ACVR2A.

The Expression of BMP5/7 and ACVR2A Positively
Correlated with Infiltrating Immune Cells in LUAD. TILCs are associated with sentinel lymph node metastasis and prognosis of tumors, and tumor purity impacts immune infiltration in the analysis of clinical sample data based on genetic testing. The BMP5/7 expression significantly negatively correlated with tumor purity, while ACVR2A had no significant correlation with tumor purity (Figure 6). Interestingly, there was a significant positive correlation between all three genes and the level of infiltrating CD4 + T cells and macrophages ( Figure 6). These findings strongly demonstrated that BMP5/7 and ACVR2A could recruit immune cells in the tumor microenvironment (TME) in LUAD, especially CD4 + T cells and macrophages.

Discussion
The present study sought to explore the involvement of BMPs in LUAD by analyzing publicly available data, specifically focusing on their prognostic value. Three hub genes, BMP5, BMP7, and ACVR2A, were identified, and their mRNA expression was relevant to the prognosis value in LUAD patients using the TCGA-LUAD database. These three hub genes were used to construct a prognostic model, which was further validated in another GEO database (GSE31210). Furthermore, a practical nomogram was developed.

18
BioMed Research International the expression of BMP4 was positively correlated with the widespread metastasis of human tumors [28]. BMP3 and BMP6 mRNA expression is downregulated in NSCLC tissues and cell lines, which is related to the development of lung tumors [29,30], and the expression of BMP6 was also downregulated in LUAD tissues in the present study. BMP6 was a tumor suppressor of lung cancer, and the loss of BMP6 expression was tightly associated with the poor prognosis of the NSCLC patients [31]. Based on our results, BMPR1A/2 expression had a prognostic value in LUAD patients. Fur-thermore, BMPR1A, BMPR1B, and BMPR2 were associated with clinical factors, including age, ethnicity, tumor size, and lymph node metastasis. BMP7 signaling via BMPR1A and BMPR1B could inhibit the proliferation of lung cancer cells [32]. ACVR1 correlated with better OS in our study but should be regarded as a complex regulator of cancer biology, because ACVR1 could act as a tumor suppressor or oncogene, depending on the type of cancer or cell, or ligand involved [33]. The ACVR1B gene was correlated with the risk of lung cancer in adults exposed to environmental tobacco  19 BioMed Research International smoke [34]; similarly, ACVR1B was associated with poor prognosis in the present study.
The expression of TGFBR2 in NSCLC tissue samples was low compared to adjacent normal lung tissue [35], and the expression of TGFBR3 in NSCLC was lower than that in normal lung tissue [36]. We observed similar results in LUAD tissues. Moreover, the expression of TGFBR3 was downregulated in the blood and tumor tissue cells of patients with stage I LUAD [37]. The high expression of TGFBR2 was an important risk factor for the diminishing of OS and disease-free survival (DFS) in NSCLC patients [38], and the P value of the prognostic value of TGFBR2 based on the Kaplan-Meier Plotter was also <0.05, which suggests that there was a tight relationship between the expression of TGFBR2 and the prognosis of LUAD patients. Regulating TGFBR2 signaling could induce multidrug resistance in LUAD [39]. Decreased expression of TGFBR2 in human NSCLC was related to more invasive tumor behavior and inflammation, and the deletion of TGFBR2 in mouse airway epithelial cells promoted the formation of adenocarcinoma [40]. Global profile analysis of LUAD identified TGFBR2 as an aggressive inhibitor [41].
The mRNA and protein expression of BMP5 in LUAD tissue were remarkably high compared with lung squamous cell carcinoma tissue [42]. Our bioinformatics analysis also revealed decreased expression of BMP5 in LUAD tissue compared to normal lung tissue. Besides, downregulated BMP5 correlated with poor prognosis. It had been illustrated that epithelial-mesenchymal transition induced by TGF-β was mediated by Blimp-1-dependent inhibition of BMP5 [43]. PinX1-arrested cell cycle transition led to the NSCLC cell proliferation, and BMP5 might take part in PinX1associated cell proliferation and cell cycle transition [44]. Although BMP5 may play a role in LUAD, the underlying mechanisms have not been fully elucidated. The GSEA analysis revealed that BMP5 was involved in the calcium signaling pathway, cytokine-cytokine receptor interaction, TGF-β signaling pathway, gonadotropin-releasing hormone (GnRH) signaling pathway, and mitogen-activated protein kinase (MAPK) signaling pathway. The GnRH signaling pathway not only plays an important role in human development but also in the occurrence and development of human cancer [45,46]. Also, previous studies have shown that the MAPK signaling pathway plays a vital role in preventing the occurrence and development of cancer [47]. These candidate pathways mentioned above demanded further experimental verification and could be clues of future mechanism studies.
It has been shown that BMP7 not only promoted growth stimulatory but also promoted inhibitory effects in tumor cells [48][49][50], and BMP7 was related to the postoperative prognosis of NSCLC patients [51]. In this study, we discovered that the expression level of BMP7 in LUAD tissues was not significantly different from that in adjacent noncancerous tissues, but upregulated BMP7 correlated with poor prognosis, female, central lung cancer, and lower T stage in LUAD patients, indicating that BMP7 had a close relationship with LUAD. Previous studies have shown that the expression of BMP7 was related to lymph node metastasis in patients with lung cancer [48]. Moreover, BMP7 expression in NSCLC was correlated with the progression of the tumor and poor prognosis [51]. Correspondingly, the GSEA

20
BioMed Research International analysis showed that the low expression of BMP7 was enriched in the cell cycle, DNA replication, proteasome, purine metabolism, and pyrimidine metabolism. When BMP7 binds to BMPRII, it then binds to BMPRI. Restricted Smads (R-Smads) include Smad1/5/8, also known as Smad1/5/9, are phosphorylated, and then enter the nucleus with Smad4, where they regulate target genes [52]. BMP7 was regarded as an effective target for inhibiting cell growth and inducing cell apoptosis [53,54]. Several reports have demonstrated that BMP7 promoted cell invasiveness and motility of lung cancer cells [42,55]. The miR-137/BMP7 axis could contribute to the progression of NSCLC [56].
In the present study, upregulated ACVR2A correlated with poor prognosis in LUAD patients, and the GSEA analysis showed that the low expression of ACVR2A was enriched in the cell cycle, cytokine-cytokine receptor interaction, purine metabolism, and pyrimidine metabolism. Activin A could inhibit the transmission of BMP signals through combining with ACVR2A and ACVR2B [8]. Unfortunately, few studies have focused on the role of ACVR2A in LUAD.
Gene-based risk assessment models for lung cancer are emerging [57,58]. In the present study, the three hub genes (BMP5, BMP7, and ACVR2A) were used to construct a prognostic model. The GSE31210 cohort was used to validate the prognostic model predictive power. Fortunately, for both the TCGA and GEO cohorts, the AUCs of the ROC curves for the prognostic model indicated that the three-gene signature had a good performance for survival prediction. We also developed a nomogram to accurately predict the likelihood of OS in LUAD patients. The calibration plots demonstrated that actual survival was closely related to predicted survival, confirming good prediction performance. At the same time, we proved that the nomogram was effective in terms of Cindex, AUC, and DCA. Overall, the BMPs/BMP receptors prognostic signature accurately predicted survival outcomes of LUAD patients in our study, thereby has potential for clinical applications.
However, several limitations should be considered. First, the information in the TCGA database was limited, for example, there were 218 patients with missing information about radiotherapy. Second, our nomogram was not externally verified in the GEO database because GSE31210 lacked clinical data. A robust nomogram should be verified externally in different cohorts; therefore, the nomogram needs to be further verified in multicenter clinical trials and prospective studies. In the future, we will also explore the possibility of incorporating more prognostic variables and other regression modeling methods to further improve performance and prediction accuracy.

Conclusions
In conclusion, the expression of BMPs/BMP receptors in LUAD was different from normal lung tissues, and the high expression of most of them was associated with poor prognosis in LUAD patients. The three-gene (BMP5, BMP7, and ACVR2A) prognostic model performed well in predicting the OS of LUAD patients. And the nomogram comprising the prognostic model was a reliable tool for predicting the OS of LUAD patients. BMP5, BMP7, and ACVR2A are potential therapeutic targets for LUAD.

Data Availability
The data sets analyzed during this study are publicly available databases, such as The Cancer Genome Atlas (TCGA) database and Gene Expression Omnibus (GEO) dataset.