Construction and Validation of a Macrophage-Associated Risk Model for Predicting the Prognosis of Osteosarcoma

Background Osteosarcoma is one of the most common bone tumors among children. Tumor-associated macrophages have been found to interact with tumor cells, secreting a variety of cytokines about tumor growth, metastasis, and prognosis. This study aimed to identify macrophage-associated genes (MAGs) signatures to predict the prognosis of osteosarcoma. Methods Totally 384 MAGs were collected from GSEA software C7: immunologic signature gene sets. Differential gene expression (DGE) analysis was performed between normal bone samples and osteosarcoma samples in GSE99671. Kaplan–Meier survival analysis was performed to identify prognostic MAGs in TARGET-OS. Decision curve analysis (DCA), nomogram, receiver operating characteristic (ROC), and survival curve analysis were further used to assess our risk model. All genes from TARGET-OS were used for gene set enrichment analysis (GSEA). Immune infiltration of osteosarcoma sample was calculated using CIBERSORT and ESTIMATE packages. The independent test data set GSE21257 from gene expression omnibus (GEO) was used to validate our risk model. Results 5 MAGs (MAP3K5, PML, WDR1, BAMBI, and GNPDA2) were screened based on protein-protein interaction (PPI), DGE, and survival analysis. A novel macrophage-associated risk model was constructed to predict a risk score based on multivariate Cox regression analysis. The high-risk group showed a worse prognosis of osteosarcoma (p < 0.001) while the low-risk group had higher immune and stromal scores. The risk score was identified as an independent prognostic factor for osteosarcoma. MAGs model for diagnosis of osteosarcoma had a better net clinical benefit based on DCA. The nomogram and ROC curve also effectively predicted the prognosis of osteosarcoma. Besides, the validation result was consistent with the result of TARGET-OS. Conclusions A novel macrophage-associated risk score to differentiate low- and high-risk groups of osteosarcoma was constructed based on integrative bioinformatics analysis. Macrophages might affect the prognosis of osteosarcoma through macrophage differentiation pathways and bring novel sights for the progression and prognosis of osteosarcoma.


Introduction
Osteosarcoma, as a common malignant tumor, occurred mostly in the distal femur and proximal tibia metaphyses. Currently, the incidence of osteosarcoma worldwide was three to four per million people [1]. e main treatment for osteosarcoma consisted of chemotherapy and surgery [2]. As a highly aggressive tumor, nearly half of osteosarcoma would metastasize, and the lung was the most common metastatic site [3]. Despite a relatively low incidence of osteosarcoma, the prognosis of osteosarcoma continued to be very poor, [7]. A recent study showed that the infiltration of macrophages was related to the prognosis of breast cancer through paracrine interaction between breast cancer cells and macrophages [8]. A previous research also revealed that macrophages would promote the growth of epithelial cells with cancer-related mutations [9]. A recent study also implied that TAM prevented metastasis in high-grade osteosarcoma by collecting 132 clinical osteosarcoma samples [10]. TAM could be divided into three types based on their functional properties: M1, M2, and M0 [11]. M1 macrophages, activated by lipopolysaccharides, 1, and other cytokines, played an important role in promoting inflammation and antimicrobial [12]. Meanwhile, M2 macrophages, mainly induced by CSF-1 and IL-10, were involved in tumor progression, wound healing, tissue repairing, and inhibition of inflammation [13]. It is currently believed that M2 macrophages promoted tumor growth and metastasis while M1 macrophages inhibited tumor formation by secreting cytokines [14]. An injection of M1 macrophages into mice with Ehrlich ascites carcinoma could improve the survival rate of the mice [15]. A former study also indicated that M1 macrophages could reduce the growth of colon cancer cells [16]. Moreover, a high proportion of M2 macrophages could lead to a poor prognosis in ovarian cancer [17]. In Zhou's study, osteosarcoma metastasis could be prevented by all-trans retinoic acid through the prohibition of M2 polarization [18]. Although osteosarcoma and TAM have been studied in recent years, the TAM-related diagnosis and prognostic indicators of osteosarcoma were still rare. Furthermore, most of the current osteosarcoma indicators were based on the hematological study [19], and the influence of the tumor microenvironment on osteosarcoma was rarely considered. erefore, developing a macrophage-associated risk model to predict the prognosis of osteosarcoma was urgently needed.
erapeutically applicable research to generate effective treatments (TARGET) was a database of pediatric tumors, including acute lymphoblastic leukemia, acute myeloid leukemia, kidney tumors, neuroblastoma, and osteosarcoma. e CIBERSORT deconvolution algorithm was a machine learning method to calculate the infiltration of various immune cells based on bulk transcriptome data through linear support vector regression and highly robust to noise [20]. LM22 was a gene expression label matrix of immune cells. It consisted of 547 genes that could differentiate 22 immune cell phenotypes [21]. CIBERSORT has been widely used in predicting the proportion of immune cells in cancers. e Connectivity Map (cMap) was an instrumental online bioinformatics database that includes gene expression profiles of more than 7,000 tumor cell lines and 1,309 drugs [22]. Decision curve analysis was a statistical method to evaluate clinical prediction models, diagnostic experiments, and molecular markers [23]. Nomogram could simplify the complex prediction model in the probability estimation of the event (such as death or recurrence) based on the specific situation of each patient. Multivariate Cox regression analysis has been widely applied in clinical research [24]. Here, the clinical and transcriptome data of osteosarcoma from the TARGET database (TARGET-OS) were collected in this study. 384 MAGs were collected from GSEA software C7: immunologic signature gene sets [25,26]. e STRING database was further used to detect MAGs with a high connection [27]. en 5 MAGs were selected to construct the risk model through integrative bioinformatics analysis. Moreover, the prognostic nomogram was constructed to evaluate our risk model and further validated by a bootstrap test. Besides, the independent data set from the GEO database was collected to validate our model. In this study, we aimed to identify macrophage-associated gene signatures to predict the prognosis of osteosarcoma, which could help clinicians evaluate patients' conditions and provide novel insights for osteosarcoma. e median ages (interquartile range (IQR)) of TARGET-OS and GSE21257 were 14.5 (12.2-17.4) years and 16.7 (13.6-18.7) years, respectively. GPL10295 platform was used for the GSE21257 data set while the GPL20148 platform was used for the GSE99671data set. Robust spline normalization was performed in GSE21257 while normalization of fragments per kilobase of exon model per million mapped fragments (FPKM) was performed in the TARGET-OS data set. Gene sets involving macrophage from GSEA software C7: immunologic signature were selected by searching using the keyword "macrophage." en, a total of 384 MAGs were collected after removing overlapped genes. ese 384 MAGs were listed in Table S1. All data were normalized through the z-score method:

Materials and Methods
where Zwas the standard value; xwas the specific gene expression value; µwas the mean expression value of each sample; and σwas the standard deviation. e probe was a fluorescent-labeled nucleic acid complementary to a specific nucleotide sequence of the target gene. e RNA or cDNA fragment of the gene was captured by base complementary hybridization. Here, we used each probe to match genes. e maximum value of the probe was selected when the gene matched at least two probes. TARGET-OS was the training data set while GSE21257 was the test data set. Detailed information about patients with complete clinical information in TARGET-OS and 2 Journal of Oncology GSE21257 were presented in Tables 1 and 2. Basic information of three data sets were provided in Table 3. e flowchart of this study was shown in Figure 1.

Differential Gene Expression Analysis and Protein-Protein
Interaction Analysis. Totally 384 MAGs were used for DGE using the limma package [28] in R software. Gene signatures of normal bone samples or osteosarcoma were identified based on DGE analysis. Benjamini-Hochberg (BH) method [29] was used here to adjust multiple hypotheses. e screening criteria for significant genes were adjusted to p < 0.05. e differentially expressed genes were further transformed into GRP files and uploaded to the cMap database (http://www.broad.mit.edu/cmap/) for small molecule drug prediction analysis. Potential therapeutic targets for osteosarcoma were selected based on enrichment score and p < 0.05. Meanwhile, 384 MAGs were used for PPI analysis in the STRING database and further visualized by Cytoscape [30]. e confidence level was 0.4, and MAGs with degrees higher than 2 were selected for further analysis.

Analysis of Infiltration of Immune Cells in Osteosarcoma
Samples.
e infiltration of immune cells in TARGET-OS was calculated by the CIBERSORT deconvolution algorithm. As a part of the CIBERSORT deconvolution algorithm, the machine learning method, nu-support vector regression (]-SVR), was applied to this analysis. Different ] values including 0.25, 0.5, and 0.75 were selected to perform ]-SVR to predict the infiltration of immune cells for each sample. e solution of ]-SVR was screened based on the lowest root-mean-square error between the true and the predicted expressions. e formula of the CIBERSORT algorithm was as follows: where M ij represented the expression level of gene i in mixed sample j, which was the sum of its expression in r immune cell type. S ik was a label matrix (LM22; https://cibersort. stanford.edu/download.php), which represented the gene expression level of gene i in immune cells. F cj represented the proportion of cell types in the mixed sample j. e permutations of the signature matrix were 1,000. e ESTI-MATE package was also used to calculate the immune and stromal scores of each sample [31].

Survival Analysis.
e survival and survminer packages were used for survival analysis in R software. e collected 384 MAGs were divided into high-and low-risk groups based on their expression in the TARGET-OS data set for survival analysis. Besides, high-and low-risk groups from the TARGET-OS and the GSE21257 data sets were used for survival analysis, respectively. High/low immune and high/ low stromal score groups were also used for survival analysis.
e survival rate was compared by the log-rank test. p < 0.05 indicated statistical significance.

Construction of Multivariate Cox Regression Model.
Multivariate Cox regression model was constructed based on 5 differentially expressed MAGs from the TARGET-OS data set using survival package (https://cran.r-project.org/web/ packages/survival/index.html) and survminer package (https://cran.r-project.org/web/packages/survminer/index. html) in R software. For each MAG, the coefficient of multivariate Cox regression was regarded as the coefficients in the risk score. en the risk score of each sample was calculated. e formula was as follows: where coef was the coefficient of multivariate regression analysis of MAGs. Xwas the expression of each MAG. is   Journal of Oncology 3 model was constructed to predict risk scores in osteosarcoma samples. e risk here referred to the risk of a poor prognosis in the osteosarcoma samples. e higher the risk score, the higher the probability of the poor prognosis in osteosarcoma, and vice versa. Osteosarcoma samples were divided into high-and low-risk groups by risk score. Pheatmap package (https://cran.r-project.org/web/ packages/pheatmap/index.html) was used to display the expression of MAGs in the high-and low-risk groups in R software. en risk score and other clinical characteristics were also used for multivariate Cox regression analysis to identify potential independent prognostic factors of osteosarcoma in the TARGET-OS and the GSE21257 data sets.

Decision Curve and ROC Analyses.
e DCA curve was shown with the net benefit rate as the ordinate and the highrisk thresholds as the abscissa using the rmda package (https://cran.r-project.org/web/packages/rmda/index.html) in R software. e calculation formula of net benefit was as follows: where nwas the sample size, and p t was the threshold probability. Here, we explored three clinical prognostic models in TARGET-OS: simple clinical data model (gender, race, age, and tumor metastasis), simple MAGs model (5 MAGs and risk score), and complex model that integrated MAGs and clinical features (gender, race, age, tumor metastasis, 5 MAGs, and risk score). e model with the greatest net benefit under different high-risk thresholds was recommended for clinical applications of osteosarcoma. Meanwhile, ROC curves to predict the 1-, 3-, and 5-year survival of TARGET-OS and GSE21257 were performed using the time ROC package in R software [32].

Construction and Internal Validation of Prognostic
Nomogram.
e nomogram was constructed to predict the overall survival (1 year, 3 years, and 5 years) of patients in TARGET-OS using the rms package (https://cran.r-project. org/web/packages/rms/index.html) in R software. Here, patients' clinical characteristics such as age, gender, tumor metastasis, race, and risk score were used for the construction of a nomogram. en internal validation of nomogram was performed, and bootstrap was set to 1,000. e discrimination of nomogram was evaluated by concordance index while calibration plots of 1-, 3-, and 5-year survival curves of TARGET-OS were performed to evaluate the prediction performance of nomogram.   Journal of Oncology

Gene Set Enrichment Analysis.
All genes from TARGET-OS were used for GSEA based on low-and high-risk groups. Significant macrophage-associated pathways in the osteosarcoma microenvironment were identified using GSEA software. False discovery rate (FDR) < 0.05 and p < 0.05 indicated statistical significance.
2.9. Statistical Analysis. Statistical Product and Service Solutions software (SPSS 22.0) and R 3.6.2 were used for data analysis. A chi-square test was performed for categorical data. e independent-samples Kruskal-Wallis test was performed to compare MAGs expression between normal bone samples and osteosarcoma samples. Meanwhile, the Kruskal-Walls test was also performed to compare stromal and immune scores between high-and low-risk groups, respectively. All significance levels were p < 0.05.

Identification of 5 MAGs Related to the Prognosis of Osteosarcoma.
A total of 384 MAGs were screened and used for subsequent analysis. PPI analysis of these 384 MAGs was shown in Figure 2 e expression of gene WDR1 (p = 0.006), PML (p = 0.005), MAP3K5 (p = 0.010), and GNPDA2 (p = 0.009) were significantly higher in normal bone sample while in osteosarcoma samples, the expression of BAMBI (p = 0.006) was significantly higher. en the effect of different expression levels of these 5 MAGs on gender and age was also explored. However, the results turned out that these 5 MAGs were not significantly related to gender or age. ese 5 MAGs were used for subsequent analysis.

Construction of a Macrophage-Associated Risk Model.
e macrophage-associated risk model was constructed by 5 MAGs through multivariate Cox regression analysis in TARGET-OS. e formula of our risk model was as follows: en, osteosarcoma samples were divided into high-and low-risk groups by risk score, and the results were shown in Figures 4(a) and 4(b). As the risk score of patients increased, the number of deaths rose, and the survival time of patients decreased. Besides, our score could effectively predict the prognosis of osteosarcoma (concordance index = 0.797). erefore, in order to further explore the prognostic difference between the high-and low-risk groups, clinical information of high-and low-risk groups were used as training data for survival curve analysis. e result was shown in Figure 4(c). Compared with the high-risk group, the low-risk group had a significantly improved prognosis (p < 0.001). e 5-year survival rates of high-and low-risk groups were 43.0% and 90.1%, respectively. A heat map of the expression of 5 MAGs in TARGET-OS was displayed in Figure 4(d). Besides, different stromal and immune scores among high-and low-risk groups were explored. e results turned out that the stromal and immune scores of the lowrisk group were both significantly higher than the high-risk group (Figures 5(a) and 5(b)). e survival plot of each group with different immune and stromal scores was subsequently shown in Figures 5(c) and 5(d).
e low-risk group continued to have a significantly better prognosis (p < 0.0001), regardless of its immune and stromal scores. Since there was a significant prognostic difference between high-and low-risk groups, our risk score was considered to predict the prognosis of osteosarcoma effectively.

B cells -Naive B cells -Memory Plasma cells T cells -CD8 T cells -Naive T cells -Memory testing T cells -Memory activated T cells -Follicular helper T cells -Regulatory (Treg) T cells -Gamma delta NK cells -Resting NK cells -Activated Monocytes Macrophages -M0
Macrophages -M1  Journal of Oncology were shown in Figure 6(a). In univariate Cox regression analysis, risk score (p < 0.001) and tumor metastatic (p = 0.003) were closely related to the prognosis of osteosarcoma. Moreover, in multivariate Cox regression analysis, risk score (p < 0.001) and tumor metastatic (p = 0.001) were still related to the prognosis of osteosarcoma, indicating that tumor metastatic and our risk score could be considered independent prognostic factors of osteosarcoma.

Nomogram Effectively Predicted the Prognosis of
Osteosarcoma. Since considering both clinical features and our MAGs model had the best net clinical benefit, the nomogram was constructed by integrating gender, age, tumor metastasis, race, and risk score. e result was displayed in Figure 7(a). As the total points became higher, the 1-, 3-, and 5-year survival rate of patients decreased. e concordance index of the nomogram was 0.842, and the calibration curve of 1-, 3-, and 5-year survival of osteosarcoma (Figures 7(b)-7(d)) also illustrated our nomogram that could effectively predict the prognosis of osteosarcoma.

Validation of Our Risk Model by Independent Data
Set. e independent data set GSE21257 was used for the validation of our risk model. e GSE21257 data set was divided into high-and low-risk groups based on the macrophageassociated risk model. As shown in Figures 8(a) and 8(b), compared with the low-risk group, the high-risk group had a lower survival time.
e survival curve of different risk groups was shown in Figure 8(c). e 5-year survival rates of the high-and low-risk groups were 48.1% and 76.9%, respectively. e low-risk group had better clinical outcome (p = 0.040). e areas under the curve for prediction of 1-, 3-, and 5-year survival of osteosarcoma were 0.76, 0.72, and 0.73, respectively (Figure 8(d)). Moreover, the risk score and clinical information of GSE21257 including gender and age were also used for Cox regression analysis. e risk score was significantly correlated with the prognosis of osteosarcoma in univariate (p = 0.024) and multivariate (p = 0.020) regression analyses (Figure 9(a)). erefore, our risk score could be considered an independent prognostic factor of osteosarcoma.

Low-Risk Group Was Related to Macrophage
Differentiation Pathway. GSEA was also performed for all genes from TARGET-OS. e results were shown in Table S2 ese biological processes were considered important pathways of TAM in the osteosarcoma microenvironment. e above analysis revealed that MAGs played a significant role in osteosarcoma. Moreover, the results of characteristics of clinical information and prognosis of osteosarcoma were consistent with the result of TARGET-OS. erefore, our risk model could differentiate different risk groups successfully, and the low-risk group was correlated with a better prognosis of osteosarcoma.

Discussion
As osteosarcoma is one of the most common childhood tumors in the world, its treatment and prognosis have received widespread attention. Despite the surgical treatment along with pre-and postoperative chemotherapy, the eventfree 5-year survival rate remained low [33]. A previous study revealed that gene PPARG, gene IGHG3, and gene PDK1 were correlated with osteosarcoma [34]. However, this conclusion was not validated by the independent data set. In this study, DGE analysis was performed based on normal bone samples and osteosarcoma samples to identify differentially expressed MAGs. We further identified exisulind, atractyloside, and doxycycline as top potential drugs for the treatment of osteosarcoma. A recent study illustrated that exisulind could induce apoptosis in lung cancer by downregulating cyclic GMP phosphodiesterase and further improve the prognosis of lung cancer [35]. Atractyloside was also found to suppress the progression and metastasis of colon cancer in Lu's study [36]. A previous report also showed that doxycycline could reduce the proliferation of melanoma cells by inhibiting apoptosis [36]. Considering the important role these drugs played in other tumors, they might become promising drugs for osteosarcoma treatment. e selected 5 MAGs (MAP3K5, PML, WDR1, BAMBI, and GNPDA2) were identified as prognostic-related genes for osteosarcoma, and a novel macrophage-associated risk model was constructed based on these 5 MAGs. MAP3K5, also called ASK1, was an important kinase in the process of cell apoptosis, participating in the regulation of multiple cell-signaling pathways in inflammation and tumors [37]. A recent research indicated that MAP3K5 promoted macrophage activation and migration in mice models [38]. Furthermore, knockout MAP3K5 mice were more likely to develop colon cancer [39]. ese research works were also consistent with our study. MAP3K5 might be related to a better prognosis of a tumor by activating macrophages, which also needed further study to validate. e biological function of the gene PML was associated with its nuclear location, and PML was associated with cell cycle regulation and tumor suppression [40]. For instance, reduced expression of PML was found to promote multiple tumor   growth, including prostate adenocarcinoma, breast carcinoma, and thyroid carcinoma [41]. A previous research reported that PML was critical to the formation of macrophages [42]. Here, we reported that high expression of PML might inhibit osteosarcoma growth, which was also consistent with these findings. WDR1 took part in inducing the disassembly of actin filaments, and dysfunction of WDR1 might cause autoinflammatory disease, which in turn activated macrophage [43]. BAMBI, a pseudoreceptor of the TGF signaling pathway, played a key role in regulating macrophage polarization [44]. A former study indicated that highly expressed BAMBI could contribute to colon cancer metastasis through Wnt/beta-catenin in mice models [45]. Besides, the expression of BAMBI was significantly higher in ovarian cancer through TGF-beta signaling [46]. BAMBI was also highly expressed in pancreatic cancer by the TGFbeta pathway [47]. Similarly, our study reported that high expression of BAMBI would result in a poor prognosis of osteosarcoma, which implied that BAMBI could be a new target for the treatment of osteosarcoma. GNPDA2 was closely related to obesity and body mass index. A recent epidemiological survey also showed that people with a high body mass index were at higher risk of cancer [48]. Besides, macrophage accumulating in adipose tissue was related to obesity [49]. erefore, these genes were considered to be potential targets for the treatment of osteosarcoma, and further experiments are needed to verify the role of these genes in osteosarcoma.    Our study showed that the low-risk group had significantly higher immune and stromal scores, which was correlated with a better prognosis. Similarly, a previous study indicated that osteosarcoma samples with high immune scores had a better prognosis [34]. ROC curve also exhibited excellent accuracy of our risk model in TARGET-OS and GSE21257. e decision curve considered the clinical utility of specific models and focused on the net benefit of different clinical interventions. Here, the decision curve of three models (simple gene model, simple clinical model, and complex model) demonstrated that comprehensive consideration of genetic information and clinical information could obtain the greatest benefits. Besides, the nomogram also accurately predicted the prognosis of osteosarcoma.
erefore, our risk model could effectively predict the prognosis of osteosarcoma, which could be a potential clinical indicator of osteosarcoma.
e GSEA results of TARGET-OS showed that GO: macrophage activation, GO: macrophage migration, GO: macrophage differentiation, and GO: regulation of macrophage chemotaxis were considered key pathways in osteosarcoma. Macrophage activation and migration were correlated with tumor growth. A recent study reported that activated TAM could promote the angiogenesis of breast cancer [50]. TAM also promoted cancer development by upregulation of LAMP2a [51]. e migration of macrophages was inhibited under hypoxia, which was conducive to tumor growth [52]. Besides, macrophage migration could accelerate tumor invasion without relying on matrix metalloproteinase [53]. Macrophage differentiation also played a significant role in the tumor microenvironment. A previous study showed that M2 macrophage was closely associated with malignant oral squamous cell carcinomas [54]. Macrophage could be shifted to M1 macrophage by phenelzine in triple-negative breast cancer [55]. Moreover, the chemokine system was found to affect macrophage polarization. A former research reported that interferon gamma could induce the activation of M1 macrophage [56]. ese biological processes were closely related to TAM in tumor, which also provided a novel research perspective for the role of TAM in osteosarcoma.
Our study, however, had some limitations: (1) due to the limitation of sample size, we needed more research to support our conclusion, and (2) we also lacked prospective clinical trials to verify the performance of our model further.
erefore, we looked forward to more research on MAGs to explore novel ideas for the clinical treatment of osteosarcoma.

Conclusion
In general, 5 MAGs (MAP3K5, PML, WDR1, BAMBI, and GNPDA2) correlated with the prognosis of osteosarcoma were screened for the construction of the risk model. A novel macrophage-associated risk score to differentiate low-and high-risk groups of osteosarcoma was constructed based on multiple bioinformatics analyses. e high score indicated the poor prognosis of osteosarcoma while the low score indicated the better prognosis of osteosarcoma. Besides, our risk score was validated by the independent data set successfully, and nomogram effectively predicted the prognosis of osteosarcoma.

MAG:
Macrophage-associated gene GSEA: Gene set enrichment analysis DGE: Differential gene expression PPI: Protein-protein interaction cMap: Connectivity Map ROC: Receiver operating characteristic GEO: Gene expression omnibus TME: Tumor microenvironment TAM: Tumor-associated macrophage TARGET: erapeutically applicable research to generate effective treatment FDR: False discovery rate DCA: Decision curve analysis ES: Enrichment score.