Identification of LTF as a Prognostic Biomarker for Osteosarcoma

Osteosarcoma remains a major health problem in teenagers. However, its pathogenesis mechanism remains not fully elucidated. This study aims to identify the prognostic biomarkers for osteosarcoma. In this study, we selected genes with a median absolute deviation (MAD) value of the top 5000 in the GSE32981 dataset for subsequent analysis. Weighted correlation network analysis (WGCNA) was used to construct a coexpression network. WGCNA showed that the tan module and midnight blue module were highly correlated with origin and metastases of osteosarcoma, respectively. Enrichment analysis was conducted using genes in the tan module and midnight blue module. A gene coexpression network was constructed by calculating the Spearman correlation coefficients. Four key genes (LTF, C10orf107, HIST1H2AK, and NEXN) were identified to be correlated with the prognosis of osteosarcoma patients. LTF has the highest AUC value, and its effect on osteosarcoma cells was then evaluated. The effect of LTF overexpression on proliferation, migration, and invasion of MG63 and 143B cells was detected by the CCK-8 assay, transwell cell migration assay, and transwell invasion assay, respectively. The overexpression of LTF promoted the proliferation, migration, and invasion of MG63 and 143B cells. In conclusion, LTF may serve as a prognostic biomarker for osteosarcoma.


Introduction
Osteosarcoma is a kind of primary malignant bone tumors that prefer to occur in teenagers. It is estimated to occur in 2% and 3% cancer cases in age at birth to 14 and 15 to 19, respectively [1]. Despite the advances in therapy, the 5-year survival rate of osteosarcoma was only about 67%∼69% [1]. erefore, osteosarcoma remains a major health problem. e identification of novel biomarkers for osteosarcoma is still needed. Finding more accurate biomarkers would promote the outcome prediction and individualized therapy of osteosarcoma patients.
With the development of high-throughput detection techniques, a large number of genes have been found to be differentially expressed in osteosarcoma [2]. Namløs et al. performed microarray analysis and identified gene expression profiles in osteosarcoma samples of primary and metastatic origin [3]. Buddingh et al. identified gene signatures related to the metastases of osteosarcoma patients [4]. Some differentially expressed genes have been identified as biomarkers for osteosarcoma. For example, CBX3 and ABCA5 are identified as putative biomarkers for tumor stem cells in osteosarcoma [5]. High MMP9 expression was associated with poor overall survival of osteosarcoma patients [6]. Myc was highly expressed in human osteosarcoma cell lines and tissues. Higher Myc expression was correlated with metastasis and poor prognosis of patients with osteosarcoma [7]. However, the effect of altered genes in osteosarcoma has still not been fully addressed. Also, there is still a lack of understanding in the relationships among these genes.
is method could be also used in screening biomarkers for osteosarcoma.
In this study, osteosarcoma-related high-throughput data were retrieved from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/). e data were analyzed using bioinformatic methods. We screened 326 candidate genes related to the origin and metastasis of osteosarcoma by WGCNA. Subsequently, we identified 4 key genes correlated with the survival of osteosarcoma patients, including lactoferrin/lactotransferrin (LTF), C10orf107, histone cluster 1 H2ak (HIST1H2AK), and nexilin F-actin binding protein (NEXN). Furthermore, the effect of LTF on proliferation, migration, and invasion of osteosarcoma cells was confirmed.

Gene Expression Data Collection.
We searched the osteosarcoma-related datasets in the GEO database (https:// www.ncbi.nlm.nih.gov/gds/). Two datasets were involved in this work. e dataset GSE32981 from GPL3307 was used as the training dataset. e dataset GSE21257 from the GPL10295 platform was used as the testing dataset.

WGCNA.
e median absolute deviation (MAD) value was used to screen the top 5000 genes from the GSE32981 dataset. A scale-free coexpression network was established using the WGCNA package in R software [8].

Coexpression Network Establishment.
Spearman correlation coefficient (SCC) values of gene pairs were calculated. Gene pairs with |SCC| > 0.6 were involved in the coexpression network. Cytoscape software was used to screen the top ten nodes with the highest degree in tan and midnight blue modules, respectively.

e Prognostic Value of Genes.
e prognostic value of hub nodes in the constructed coexpression network was calculated. Kaplan-Meier analysis of these hub nodes was conducted using data in GSE21257. Receiver operating characteristic (ROC) curves are used to describe the prediction accuracy of the hub nodes. e area under the ROC curve (AUC) was analyzed.

CCK-8 Assay.
e proliferation of MG63 and 143B cells was detected before and 1, 2, 3, and 4 days after transfection using a CCK-8 assay kit (Jiangsu KeyGEN BioTECH Co. Ltd, Cat. No. KGA317). MG63 and 143B cells were cultured with the CCK-8 reagent for 4 h, and then the optical density was detected at 450 nm by using a microplate reader ( ermo Fisher Scientific, Multiscan MK3).

Cell Migration and Invasion Assay.
e migration and invasion ability of MG63 and 143B cells was detected by the transwell cell migration assay and transwell cell invasion assay, respectively. e difference between the two experiments is that, for the transwell cell invasion assay, the upper chamber was coated with Matrigel (BD Biosciences). In detail, the upper chambers of the transwell plate (BD Biosciences) contain a culture medium without FBS, and the lower chambers contain a complete culture medium. Fortyeight hours after incubation, invasion cells were fixed with 4% paraformaldehyde and stained with crystal violet. After washed with PBS, the images of cells were captured. Data were analyzed using ImageJ software.

Statistical
Analysis. Statistical analysis was conducted using GraphPad Prism 8 and R version 3.6.1. Data were presented as mean ± standard deviation. Comparison between two groups was performed using Student's t-tests. P < 0.05 was considered statistically significant.

Ethics Statement.
is work did not involve human participants or animals.

Results
network conform to the characteristics of the scale-free network, we performed the screening with a soft threshold. In this study, the scale-independent value reached 0.9 when the power value β was set to 7, and the average connectivity ratio was low (Figures 1(b) and 1(c)). erefore, the coexpression matrix was calculated under the condition of determined β � 7. e cluster of module eigengenes is shown in Figure 1(d).

Identification of Key Modules in Osteosarcoma.
Similar clusters were combined into a new module using two settings: height � 0.2 and min module size � 50. Eighteen modules with similar patterns of connected genes were obtained (Figure 2(a)). rough the correlation study of the network heatmap plot (Figure 2(b)), there was little correlation among the 18 modules. e adjacency relationship between gene and gene in the module is shown in Figure 2(c). e relevance between the 18 modules and characteristics (origin, age, gender, and metastases) of osteosarcoma samples was evaluated based on module-trait relationships (MTRs). e tan module included 195 genes, which showed the highest positive correlation with origin. e midnight blue module included 131 genes, which showed the highest negative correlation with metastases ( Figure 2(d)). e tan module was correlated with cancer origin (primary or metastasis). e midnight blue module was correlated with cancer metastasis. erefore, the tan module and midnight blue module were selected for further analyses.

Enrichment Analysis of Genes Involving in Tan Module and Midnight Blue Module.
e enrichment analysis was performed using genes involving in the tan module and midnight blue module, respectively. e results showed that genes involving in the tan module were mainly enriched in biological processes including "extracellular structure organization," "regulation of peptidase activity," and "actin cytoskeleton organization" (Figure 3(a)). Also, genes involving in the midnight blue module were mainly enriched in biological processes including "DNA damage/telomere stress-induced senescence," "metalloprotease DUBs," and "negative regulation of chromosome organization" (Figure 3(b)).

Coexpression Network of Genes Involving in the Modules.
A coexpression network was constructed using genes involving in the tan module and midnight blue module based on the gene coexpression relationship. Spearman correlation coefficient was calculated using genes in the tan module and midnight blue module. 177 gene pairs with |SCC| > 0.6 were screened to construct the coexpression network, including 160 nodes and 177 edges ( Figure 4).

Validation of Key Genes in the Coexpression Network.
Subsequently, we analyzed the topological properties of the established coexpression network. e top 10 hub nodes with high degree involved in the tan module and midnight blue module are shown in Tables 1 and 2, respectively. Kaplan-Meier analysis of these key genes was performed using the GSE21257 dataset. LTF in the tan module was correlated with the prognosis of osteosarcoma patients ( Figure 5(a)). C10orf107, HIST1H2AK, and NEXN in the midnightblue module were correlated with the prognosis of osteosarcoma patients (Figures 5(b)-5(d)). e ROC of specificity and sensitivity was analyzed, and the area under the curve (AUC) was then computed. e AUC of LTF, C10orf107, HIST1H2AK, and NEXN was 0.646, 0.583, 0.565, and 0.584, respectively (Figures 5(e)-5(h)). Among these genes, the AUC of LTF was the highest, which indicates its effectiveness as a prognostic biomarker in osteosarcoma. In addition, two independent datasets (GSE36001 and GSE99671) from the GEO database were used to compare the LTF expression in osteosarcoma samples and normal controls. e results showed that the expression of LTF was lower in osteosarcoma samples compared with that in normal controls ( Figure S1). e CCK-8 assay showed that the OD value in the LTF overexpression group was lower than that in the pcDNA3.1-transfected group (Figures 6(a) and 6(c)). erefore, the overexpression of LTF inhibited the proliferation of MG63 and 143B cells (Figures 6(b) and 6(d)). Transwell migration and invasion assays showed that the overexpression of LTF inhibited the migration and invasion ability of MG63 and 143B cells (Figure 7).

Discussion
Osteosarcoma is the commonest primary bone malignant tumor with a high rate of metastasis [13]. ere is still a need for specific prognostic biomarkers and drug targets for osteosarcoma. Identifying specific prognostic biomarkers may contribute to the clinical management of osteosarcoma.
In the present study, we collected osteosarcoma-associated gene profiles in the GEO database and screened the top 5000 genes from GSE32981 using MAD. A weighted gene coexpression network was constructed, and the modules of this network were calculated by TOM. e relevance between eigengenes of 18 modules and origin, age, gender, and metastases status of osteosarcoma samples was calculated.
e results showed that the tan module and midnight blue module are correlated with origin and metastases of osteosarcoma, respectively. e origin and metastases status of osteosarcoma were related to the response to therapies and may influence clinical therapeutic decision making [14,15]. erefore, these two modules were selected for further investigation.
We then performed enrichment analysis using Metascape. Genes involving in the tan module were mainly enriched in osteosarcoma-related biological processes including "regulation of peptidase activity" [16][17][18], "phenylalanine metabolism" [19,20], and "regulation of growth" [21,22]. Also, genes involving in the midnight blue module were mainly enriched in osteosarcoma-related biological processes, including "DNA damage/ telomere stress-induced senescence" [23][24][25], "metalloprotease DUBs" [26][27][28], and "interferon alpha/beta signaling" [29][30][31]. In addition, the correlation among these genes was calculated by Spearman correlation analysis. Gene pairs with |SCC| > 0.6 were identified to be screened to construct the coexpression network with 160 nodes and 177 edges. e hub nodes with high degree from the network were identified in the tan module and midnight blue module. Kaplan-Meier analysis of these key genes was performed. Among the 29 genes, four of them was correlated with the prognosis of osteosarcoma patients. C10orf107, HIS-T1H2AK, and NEXN were from the midnight blue module, and LTF was from the tan module. e expression of C10orf107 and NEXN was positively correlated with poor survival of osteosarcoma patients, whereas the expression of HIST1H2AK and LTF was negatively correlated with poor survival of osteosarcoma patients. e effect of C10orf107 in cancer was poorly understood, and only a rare translocation t(3; 10) (q26; q21) was observed in an acute myeloid leukemia patient, presented as a fusion of MECOM (chromosome 3q26.2) and C10orf107 (chromosome 10q21.2) [32]. NEXN (encode Nexilin) is related to cardiovascular diseases, including hypertrophic cardiomyopathy, coronary artery disease, and septal defects [33][34][35]. However, its role in cancer has not been reported.
LTF has been considered as a tumor suppressor in multiple cancers. LTF is deficient or lowly expressed in prostate cancer, nasopharyngeal carcinoma, oral squamous cell carcinoma, etc. [36][37][38]. Osteosarcoma patients with lower LTF have a poor survival rate compared with those with higher LTF ( Figure 5). However, its effect on osteosarcoma is still not known. We then detected its effect on  Journal of Oncology   e results showed that the overexpression of LTF inhibited the proliferation of osteosarcoma cells ( Figure 6). Moreover, the upregulation of LTF also promoted the migration and invasion of osteosarcoma cells (Figure 7). ese results were in accordance with the view that LTF is a tumor suppressor. LTF has inhibitive effects on cancer progression. Xiao et al. revealed that the growth inhibitory effects of LTF were through a p27/cyclin E-dependent pathway in head and neck cancer cells [39]. Similarly, Zhang et al. found that lactoferrin exerts an inhibitive effect on breast cancer cell growth through inducing cell cycle arrest, with little effect on normal breast cancer cells [40]. LTF also showed migration and/or invasion inhibitory effect in different types of cancer cells.
LTF could inhibit the migration of colorectal adenocarcinoma cells (Caco-2) and gastric adenocarcinoma cells (AGS) [41]. LTF suppressed and even reversed epithelialto-mesenchymal transition process in oral squamous cell carcinoma [42] and glioblastoma [43]. Moreover, the administration of LTF inhibited the liver and spleen metastasis of L5178Y-ML25 cells and lung metastasis of L5178Y-ML25 cells [44]. A recent study found that LTF deficiency enhanced lung metastasis of melanoma in an LTF KO mouse model, which was related to the enhancing of the TLR9 pathway [30].
In summary, osteosarcoma-related datasets from the GEO database had been systematically analyzed. WGCNA showed that the tan module and midnight blue module are highly correlated with origin and metastases of osteosarcoma, respectively. Enrichment analyses showed that genes in these two modules were associated with cancer-related pathways. A gene coexpression network was constructed using these genes, and the key genes were identified with a high degree. Four key genes (C10orf107, HIST1H2AK, NEXN, and LTF) were found to be correlated with the prognosis of osteosarcoma patients. LTF has the highest AUC value, and its inhibitive effect on osteosarcoma cells was validated. Still, further studies are needed to reveal the precise effect and mechanism of these genes.

Conclusions
e key gene we identified, such as LTF, could be a prognosis biomarker and therapeutic drug target for osteosarcoma. Our study brought new insights into the investigation of osteosarcoma.
Data Availability e data that support the findings of this study are available from the corresponding author upon request.

Additional Points
Key Messages. (1) e tan module and midnightblue module are with the highest correlation with origin and metastases of osteosarcoma, respectively. (2) Overexpression of LTF promoted the proliferation, migration, and invasion of osteosarcoma cells. (3) Key genes including LTF, C10orf107, HIST1H2AK, and NEXN may serve as prognostic biomarkers for osteosarcoma.  Figure