An External-Validated Prediction Model to Predict Lung Metastasis among Osteosarcoma: A Multicenter Analysis Based on Machine Learning

Background. Lung metastasis greatly affects therapeutic strategies in osteosarcoma. This study aimed to develop and validate a clinical prediction model, based on machine learning (ML) algorithms, for the risk of lung metastasis among osteosarcoma patients.
Methods. We retrospectively collected data on osteosarcoma patients from the Surveillance, Epidemiology, and End Results (SEER) database and from four hospitals in China. Six ML algorithms, namely logistic regression (LR), gradient boosting machine (GBM), extreme gradient boosting (XGBoost), random forest (RF), decision tree (DT), and multilayer perceptron (MLP), were applied to build models predicting lung metastasis from patients' demographics, clinical characteristics, and therapeutic variables in the SEER database. The models were internally validated using 10-fold cross-validation to calculate the mean area under the curve (AUC) and externally validated using the Chinese multicenter osteosarcoma data. The relative importance of the predictors was ranked and plotted for each ML algorithm, and a correlation heat map was plotted to show the correlations among predictors. The model with the highest AUC in 10-fold cross-validation and in the external validation ROC curve was selected to build a web calculator.
Results. Of all enrolled patients from the SEER database, 17.73% (194/1094) developed lung metastasis. Multiple logistic regression analysis showed that sex, N stage, T stage, surgery, and bone metastasis were all independent risk factors for lung metastasis. In predicting lung metastasis, the mean AUCs of the six ML algorithms ranged from 0.711 to 0.738 in internal validation and from 0.697 to 0.729 in external validation. Among the six ML algorithms, XGBoost had the highest AUC, with an average internal AUC of 0.738 and an external AUC of 0.729. The best-performing model was used to build a web calculator that helps clinicians calculate the risk of lung metastasis for each patient.
Conclusions. The XGBoost model may have the best predictive performance, and the online calculator based on this model can help doctors determine the lung metastasis risk of osteosarcoma patients and make individualized medical strategies.


Introduction
Osteosarcoma, the most common malignant bone tumor in children and adolescents, has an incidence of approximately 0.8-11/100,000 in people aged 15 to 19 years [1,2]. Osteosarcoma usually occurs during a period of rapid bone growth. It is most commonly observed in the bones of the extremities [3] and is characterized by bone morbidity, such as pain and swelling. It increases with age [1]. Patients with metastatic osteosarcoma have a very poor prognosis, with only about 20% to 30% achieving long-term survival, whereas this proportion rises to 65% to 70% in nonmetastatic osteosarcoma [4,5]. Of all metastatic sites, the lung is the most common (85% to 90%), followed by bone (8% to 10%) [6,7]. Lung metastases contribute to the poor prognosis of most osteosarcoma patients, even after complete resection of the primary tumor. The treatment of patients with metastatic osteosarcoma remains controversial, and the majority of clinical trials have excluded such patients, resulting in inconsistent treatment modalities [3,8,9].
Thus, considering the dramatic impact of lung metastases on survival and treatment options for osteosarcoma patients, identifying patients at higher risk of lung metastases would have strong clinical implications.
Machine learning (ML), a form of artificial intelligence, is widely used in healthcare data analysis [10][11][12][13][14][15][16]. By leveraging the predictive capabilities of ML algorithms, clinical prediction models can outperform those developed by traditional statistical approaches [17][18][19][20]. Consequently, it is necessary to create novel prediction models to better predict risk among osteosarcoma patients. With such models, clinicians can assess the risk of lung metastasis for each osteosarcoma patient and develop individual therapeutic strategies, such as adjuvant therapy and further optimized treatment regimens [21]. However, no ML models are available to predict the risk of lung metastasis in osteosarcoma [22].
In this study, using patients' demographic, pathological, and clinical characteristics, we aimed to develop an ML-based model to predict the risk of lung metastases in osteosarcoma patients. Then, the model was externally validated with data from four hospitals in China. Finally, the ML algorithm with the strongest predictive power was visualized and made interactive through a web-based calculator. As a prediction tool, it could help doctors determine the lung metastasis risk of osteosarcoma patients and make individualized medical strategies, ultimately providing a basis for future treatment and prevention strategies.

Study Populations and Design.
We retrospectively collected data from the SEER database and from four hospitals in China: the Second Affiliated Hospital of Jilin University, the Second Affiliated Hospital of Dalian Medical University, Liuzhou People's Hospital, and Xianyang Central Hospital. Although the low incidence of osteosarcoma makes it very difficult to study large samples of patients, the Surveillance, Epidemiology, and End Results (SEER) database provides a favorable resource for investigating rare malignancies in settings where prospective data or clinical trials are limited. Thus, we used this common database to analyze rare cancers [23].
Patients with osteosarcoma diagnosed between 2010 and 2016 from the SEER database were used as the training cohort. The inclusion criteria were as follows: (1)
Osteosarcoma patients from the four medical institutions in China between 2010 and 2018 were used as the validation cohort. Patients were included if the diagnosis of osteosarcoma was pathologically confirmed and they did not have other primary tumors. Patients were excluded if there were missing data or if the follow-up was less than two years. The follow-up deadline was December 1, 2020. At each institution, patients were followed up for at least two years, and clear clinicopathological and follow-up information was recorded. Information retrieved included patient demographics (race, gender, and age at diagnosis), tumor characteristics (primary site, grade, laterality, T stage, N stage, lung metastases, and bone metastases), and follow-up data for treatment (surgery, chemotherapy, and radiation therapy).

Definition of Predictive Variables.
All potential predictors were standardized in the study. There were three categories of race in the SEER data: white, black, and other. Because "other" did not specify an ethnicity, the multicenter data from China were all classified as other. Treatment modalities included surgery, chemotherapy, and radiotherapy, each categorized as "No" or "Yes." All potential predictors included race ( ). T indicates the primary tumor: TX means the primary tumor is unknown, T0 means no evidence of primary tumor, T1 means the tumor is confined to the bone cortex, and T2 means the tumor exceeds the bone cortex. N indicates regional lymph node metastasis: NX means the regional lymph node status is unknown, N0 means no regional lymph node metastasis, and N1 means regional lymph node metastasis is present. Survival time was defined as the interval between the surgery date and the death date.

Development and Validation of Prediction Models.
ML algorithms have been reported to outperform traditional regression methods in predicting outcomes [12,18,24-26]. This study used six machine learning algorithms to build the models: logistic regression (LR), gradient boosting machine (GBM), extreme gradient boosting (XGBoost), random forest (RF), decision tree (DT), and multilayer perceptron (MLP). XGBoost is an ensemble algorithm based on boosting. It typically combines CART trees and is an improvement on gradient tree boosting. For internal validation, 10-fold cross-validation was applied to the training cohort to evaluate the predictive power of each machine learning classifier, and the average AUC was plotted.
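The internal validation step described above can be sketched in plain Python. This is a minimal illustration rather than the study's pipeline: the stratified fold assignment, the toy `fit_rate_model` scorer (a stand-in for any of the six classifiers), and the synthetic data are all assumptions introduced here.

```python
import random
from statistics import mean


def auc(labels, scores):
    """Rank-based AUC: chance that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(labels, k=10, seed=0):
    """Split row indices into k folds with a similar event rate in each fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def cross_validated_auc(X, y, fit, k=10, seed=0):
    """Mean AUC over k held-out folds; fit(X_train, y_train) returns a scorer."""
    aucs = []
    for held_out in stratified_folds(y, k, seed):
        held = set(held_out)
        train = [i for i in range(len(y)) if i not in held]
        score = fit([X[i] for i in train], [y[i] for i in train])
        aucs.append(auc([y[i] for i in held_out],
                        [score(X[i]) for i in held_out]))
    return mean(aucs)


def fit_rate_model(X_train, y_train):
    """Hypothetical toy model: event rate among training rows sharing feature 0."""
    overall = sum(y_train) / len(y_train)
    rate = {}
    for value in {row[0] for row in X_train}:
        ys = [y for row, y in zip(X_train, y_train) if row[0] == value]
        rate[value] = sum(ys) / len(ys)
    return lambda row: rate.get(row[0], overall)


# Synthetic data (not patient data): one binary predictor, binary outcome.
X = [[1] for _ in range(100)] + [[0] for _ in range(100)]
y = [1] * 40 + [0] * 60 + [1] * 10 + [0] * 90
mean_auc = cross_validated_auc(X, y, fit_rate_model, k=10)
print(f"mean 10-fold AUC: {mean_auc:.3f}")
```

Any classifier that exposes a fit step and a per-row risk score could replace `fit_rate_model`; the fold logic and the averaging of the AUC stay the same.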
Using the validation cohort, ROC curves of the six machine learning models were plotted and AUCs were calculated to evaluate the predictive ability of the models in a different cohort. In comparing machine learning algorithms, the closer the AUC is to 1, the better the classification model. Subsequently, based on the model with the best predictive ability, we created an online risk calculator that can make predictions using newly entered data of patients with osteosarcoma, thus enabling clinicians to easily and more accurately predict the risk of lung metastasis in these patients. Using the permutation importance principle, the results of 100 independent training simulations were used to assess the importance of the predictors for each ML model predicting lung metastasis. A correlation heat map of the predictors was created to assess the correlations among predictors.
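The permutation importance principle mentioned above can be illustrated with a short sketch. The assumptions are stated plainly: importance is measured here as the mean drop in AUC after shuffling one predictor column, the 100 repeats are shuffles rather than the 100 training simulations of the study, and the scorer and data are hypothetical.

```python
import random


def auc(labels, scores):
    """Rank-based AUC: chance that a random positive outscores a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def permutation_importance(score, X, y, n_repeats=100, seed=0):
    """Mean drop in AUC when each column is shuffled, one column at a time."""
    rng = random.Random(seed)
    baseline = auc(y, [score(row) for row in X])
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            X_perm = [row[:col] + [v] + row[col + 1:]
                      for row, v in zip(X, shuffled)]
            drops.append(baseline - auc(y, [score(r) for r in X_perm]))
        importances.append(sum(drops) / n_repeats)
    return importances


def score(row):
    """Hypothetical fitted model that relies only on the first predictor."""
    return row[0]


# Synthetic data (not patient data): column 0 is informative, column 1 is noise.
X = [[1, i % 2] for i in range(100)] + [[0, i % 2] for i in range(100)]
y = [1] * 40 + [0] * 60 + [1] * 10 + [0] * 90
importances = permutation_importance(score, X, y, n_repeats=100)
print(importances)
```

A predictor the model ignores gets an importance of zero, while shuffling an informative predictor pulls the AUC toward 0.5, which is why the ranking reflects how much each model depends on each variable.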

Statistical Analysis.
We extracted data from the SEER database using SEER*Stat (version 8.3.5) software. Baseline characteristics of the training cohort and validation cohort were compared using chi-square tests and independent samples t-tests. Univariate logistic regression analysis was performed to assess risk factors predicting lung metastasis in the training cohort of patients with osteosarcoma. Predictors with P < 0.05 in the univariate logistic regression were included in the multivariate logistic regression analysis. Variables identified as independent risk factors (P < 0.05) were included in the predictive models of the machine learning algorithms. A backward stepwise selection method was used, and odds ratios (ORs) with 95% confidence intervals (CIs) were calculated. Statistical analyses were performed using R software (version 4.1.1). Machine learning models and web calculators were built using Python (version 3.8). P < 0.05 was considered statistically significant.
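To make the odds ratio and its 95% confidence interval concrete, the sketch below computes an unadjusted OR with a Wald interval from a 2x2 table. It is a simplification under stated assumptions: the study's ORs came from logistic regression, which adjusts for the other predictors in the multivariate step, and the counts used here are hypothetical.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald 95% CI from a 2x2 table.

    a: exposed with event      b: exposed without event
    c: unexposed with event    d: unexposed without event
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only (not taken from the study tables).
odds_ratio, lower, upper = odds_ratio_ci(a=30, b=70, c=20, d=80)
print(f"OR = {odds_ratio:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An interval that excludes 1 corresponds to P < 0.05 for that predictor in this unadjusted setting.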

Baseline Patient Characteristics.
After inclusion and exclusion, a total of 1201 patients were included. There were significant differences between the training and validation groups in terms of race and duration of radiotherapy (P < 0.05). All patients from the Chinese multicenter cohort were Chinese, which was categorized as "other" in the SEER database. Also, a higher proportion of patients from China were treated with chemotherapy. The remaining parameters (lung metastasis, age, survival time, gender, site of origin, grade, T stage, N stage, surgery, radiotherapy, and bone metastasis) did not differ significantly (Table 1). Notably, of all enrolled patients from the SEER database, 17.73% (194/1094) developed lung metastasis. Among all patients from the multicenter cohort, 18.69% (20/107) had lung metastasis.
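The two lung metastasis proportions reported above can be checked, and the cohorts compared, with a small sketch. It uses only the counts given in the text (194/1094 and 20/107); the choice of an uncorrected Pearson chi-square test is an assumption, since the exact test settings are not stated.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and p-value for a 2x2 table."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


seer_events, seer_total = 194, 1094    # SEER training cohort
china_events, china_total = 20, 107    # Chinese multicenter validation cohort

seer_pct = 100 * seer_events / seer_total
china_pct = 100 * china_events / china_total
chi2, p_value = chi_square_2x2(
    seer_events, seer_total - seer_events,
    china_events, china_total - china_events,
)
print(f"SEER {seer_pct:.2f}%, China {china_pct:.2f}%, chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

Under these assumptions the p-value is well above 0.05, in line with the statement that lung metastasis rates did not differ significantly between the cohorts.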
There were statistically significant differences in gender, T stage, N stage, surgery, radiotherapy, bone metastases, and survival time between patients with and without lung metastases at baseline, with no statistical differences in the remaining variables (Table 2). In detail, patients with lung metastases had a higher proportion of males, higher T and N stages, more frequent radiotherapy and bone metastases, and shorter survival time (Table 2).

Univariate and Multivariate Logistic Regression.
Univariate logistic regression analysis identified six risk factors associated with lung metastases: gender, N stage, T stage, surgery, radiotherapy, and bone metastases (Table 3). Multivariate logistic regression analysis showed that gender, N stage, T stage, surgery, and bone metastasis were independent risk factors for lung metastasis. Among them, female sex was an independent protective factor, whereas T stage (T2, T3, TX), N stage (N1, NX), failure to undergo surgery, and bone metastasis were independent risk factors for lung metastasis.

Performance of the Machine Learning Algorithm.
The performance of the machine learning algorithms was validated in the training set with 10-fold cross-validation, and the results are shown in Figure 1. The XGBoost model exhibited the highest performance in predicting lung metastasis, with a mean AUC of 0.738. The external validation results using the validation set are shown in Figure 2: the XGBoost model again showed the highest performance in predicting lung metastasis in the external data cohort, with AUC = 0.729. Therefore, we chose the XGBoost model as the final prediction model. Figure 3 shows the relative importance of variables in each of the lung metastasis prediction ML algorithms. We could observe a trend in the prediction variables: although the importance of the variables varied slightly among the different ML algorithms, surgery was in first place in five algorithms, and T stage and bone metastasis were also in the top three among the five algorithms. In contrast, sex was last among the five algorithms. The importance of the variables in the XGBoost model, in descending order, was as follows: surgery, T stage, bone metastases, N stage, and sex. Figure 4 shows the correlation of variables in the lung metastasis prediction ML algorithms. We could observe that there was no clear positive correlation among the variables. Surgery had a significant negative correlation with three variables: T stage, N stage, and bone metastases.
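The numbers behind a correlation heat map such as Figure 4 can be sketched as a pairwise correlation matrix. The choice of the Pearson coefficient is an assumption, since the text does not name the coefficient used, and the 0/1 encoded values below are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlation_matrix(columns):
    """Nested dict of pairwise correlations, the values a heat map would color."""
    names = list(columns)
    return {r: {c: pearson(columns[r], columns[c]) for c in names} for r in names}


# Hypothetical 0/1 encoded predictors for eight patients (illustration only).
columns = {
    "surgery": [1, 1, 1, 1, 0, 0, 1, 0],
    "bone_metastasis": [0, 0, 0, 1, 1, 1, 0, 1],
}
matrix = correlation_matrix(columns)
for name, row in matrix.items():
    print(name, {k: round(v, 2) for k, v in row.items()})
```

In this toy example surgery and bone metastasis are negatively correlated, which mirrors the direction reported for surgery in the text but not its magnitude.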

Web-Based Calculator.
A web-based calculator was built based on the most predictive XGBoost algorithm for clinicians to predict the risk of lung metastasis in osteosarcoma patients (https://share.streamlit.io/liuwencai123/os_lm/main/os_lm.py) (Figure 5). This calculator was easy to use, and doctors could calculate the probability of developing lung metastasis for each osteosarcoma patient simply by entering easily available preoperative and intraoperative clinicopathological variables. The probability is displayed automatically after clicking the "predict" button.

Discussion
Metastasis from sarcoma is often confined to the lung, and metastasectomy is an important component of the management of sarcoma. One study of 202 sarcoma patients found that 81% had lung metastases and 62% had only lung metastases [8]. This study developed and validated several machine learning algorithms to predict lung metastasis in osteosarcoma patients. The results showed that the XGBoost model had the best predictive power in both internal and external validation. To make the clinical application of this model feasible, we built a web calculator that estimates the individual probability of lung metastasis for each osteosarcoma patient. This ML-based model can guide clinicians in tailoring each patient's treatment plan, making precision medicine possible.
Computational Intelligence and Neuroscience
The proportion of male patients was slightly higher than that of female patients in both the US SEER data and the Chinese multicenter cohort. The logistic regression analysis showed that the risk of lung metastasis was 0.58 times lower in female patients than in male patients. To our knowledge, this was the first study to focus on the effect of gender on lung metastasis from osteosarcoma. One study found that the mean age of onset of osteosarcoma was 10 to 14 years for women and 15 to 19 years for men. We therefore speculated that differences in sex hormone levels during the development of secondary sexual characteristics might contribute to the differences in tumor aggressiveness.
The multiple logistic regression analysis showed that the risk of lung metastasis was much higher in T2, T3, and TX than in T1, and the risk increased with larger tumor volume. Previous studies have shown that patients with smaller osteosarcomas had better survival expectations. A larger tumor volume means that the tumor has had a longer growth cycle and is more aggressive and invasive, and is therefore more prone to lung metastases.
Tumor size also influenced treatment strategies, with the correlation heat map showing a negative correlation between T stage and surgery. Larger tumors were challenging for the surgeon because the likelihood of complete resection declined. Regarding N stage, patients with NX stage were significantly more likely to develop lung metastasis than patients with other stages. The proportion of patients with definite lymphatic metastases (N1) was low in both the training and validation cohorts, neither exceeding 5%. However, the proportion of N1 versus NX was higher in patients presenting with lung metastases than in the nonmetastatic group.
Therefore, we believed that the presence of lymphatic metastases indicated that the osteosarcoma was very aggressive. One study found that patients presenting with local lymphatic metastases or distant metastases had significantly lower survival rates than other patients [24,25]. Most deaths among osteosarcoma patients were due to lung metastases. Osteosarcoma is relatively rare in departments that do not specialize in bone oncology, and osteosarcoma presenting with lymphatic metastases is even rarer. Considering the correlation between lymphatic metastasis and lung metastasis, clinicians should not neglect examination for lymphatic metastasis.
Bone metastases were also uncommon in osteosarcoma and did not exceed 5% in either cohort. However, of the 55 patients who presented with bone metastases in this study, 35 had concomitant lung metastases. Bone metastases were a manifestation of multimetastatic disease, and patients with isolated bone metastases at presentation were rare. Thus, we recommended that patients with bone metastases or multifocal osteosarcoma be further examined for lung metastases. Surgery was performed in 85.5% of patients without lung metastases, much higher than the 61.7% of patients with lung metastases. When the tumor was considered unresectable or difficult to resect, namely, when the T or N stage was advanced, chemotherapy or radiation therapy was recommended first, followed by periodic reassessment of tumor resectability. In some cases, surgical resection was extremely challenging for the surgeon, but every effort should still be made to pursue surgical opportunities [27][28][29].
To our knowledge, this was the first study to attempt to predict osteosarcoma lung metastasis using machine learning algorithms. In addition, it was the first multicenter osteosarcoma study to use both the US SEER database and data from multiple medical centers in China. Some previous prediction models for osteosarcoma were developed using the SEER database alone, and it was not clear whether they could be used in different regions [30][31][32]. Also, all of these studies used only a nomogram as a visual prediction model and did not provide a dynamic prediction model, which limited their convenience. More importantly, most studies on prediction models for osteosarcoma patients were single-center studies without external validation in different patient cohorts, so their validity and clinical utility were greatly compromised [33][34][35]. Therefore, we collected data on osteosarcoma patients from four medical centers in different regions of China as a validation group to validate the model's predictive power and its value for use in different regions. Furthermore, to increase the clinical utility of the model, we built a web calculator based on the XGBoost model, which had the best ability to predict the risk of lung metastasis. Clinicians can use it to calculate the risk of lung metastasis for each patient with osteosarcoma and thus personalize treatment plans.
However, despite our best efforts to improve it, this study still had limitations. First, retrospective studies might lead to data bias. Second, although we externally validated the model using different patient cohorts, prospective studies were needed to determine whether it improved patient outcomes.
Third, the information currently available in the SEER clinical database was somewhat limited. Details such as specific protocols for surgical margins and radiotherapy were not available, and including them could further improve the predictive power of the model.

Conclusions
Through multiple logistic regression analysis, we showed that sex, N stage, T stage, surgery, and bone metastasis were all independent risk factors for lung metastasis. The mean AUCs of the six ML algorithms ranged from 0.711 to 0.738 in internal validation and from 0.697 to 0.729 in external validation. Among the six ML algorithms, XGBoost showed the best performance, with an average internal AUC of 0.738 and an external AUC of 0.729.
The XGBoost model may have the best predictive performance, and the online calculator based on this model can help doctors determine the lung metastasis risk of osteosarcoma patients and make individualized medical strategies.

Data Availability
The data that support the findings of this study are available from the SEER registry, but restrictions apply to the availability of these data, which were used under license for the current study and so are not publicly available. All multicenter data generated or analyzed during this study are included in this published article.

Ethical Approval
This study was exempted from Institutional Review Board approval, in view of the SEER's use of unidentifiable patient information. Due to the strict register-based nature of the study, informed consent was waived. The study of multicenter data was approved by the ethics review committee of the Second Affiliated Hospital of Jilin University, the Second Affiliated Hospital of Dalian Medical University, Liuzhou People's Hospital, and Xianyang Central Hospital (No. 20210021).

Consent
Not applicable.

Conflicts of Interest
The authors declare that they have no competing interests.

Authors' Contributions
WLL and WCL contributed equally to this work. CLY, SBS, and QL designed the article. FHM and BW collected and evaluated the data. WLL and WCL wrote the first draft of the manuscript. All authors reviewed the manuscript and contributed to the interpretation of the results. WLL and CLY wrote the final draft of the manuscript. CX and STD read and approved the final version of the manuscript.