Machine Learning of Dose-Volume Histogram Parameters Predicting Overall Survival in Patients with Cervical Cancer Treated with Definitive Radiotherapy

Purpose: To analyze the effects of dosimetric parameters and clinical characteristics on overall survival (OS) using machine learning algorithms. Methods and Materials: A total of 128 patients with cervical cancer were treated with definitive pelvic radiotherapy with or without chemotherapy, followed by image-guided brachytherapy. Elastic-net models integrating dose-volume histogram (DVH) parameters and baseline clinical factors, DVH parameters only, and baseline clinical factors only were constructed using 5-fold cross-validation with 100 bootstrap iterations and were then compared using the concordance index (C-index). Finally, the selected important factors were used to build a multivariable Cox proportional hazards (Cox-pH) model for OS, which was also presented as a nomogram for clinical use. Results: The median OS was 25.78 months, with 25 (19.53%) deaths. The elastic-net models integrating clinical and DVH factors had the best prediction performance (C-index 0.76 in the train set and 0.74 in the test set). Three important factors were selected: baseline hemoglobin level as a protective factor, and primary gross tumor volume (GTV_P) and body V5 as risk factors. The final multivariable Cox-pH model constructed from these factors had a C-index of 0.78 (95%CI: 0.73–0.81). Conclusions: This is the first attempt to establish elastic-net models to study the contributions of DVH parameters to predicting OS in patients with cervical cancer. These results can facilitate individualized tailoring of radiation treatment in cervical cancer patients.


Introduction
Cervical cancer is the fourth most frequently diagnosed cancer and the fourth leading cause of cancer death in women, with an estimated 604,000 new cases and 342,000 deaths worldwide in 2020 [1]. Multidisciplinary management planning based on the tumor size and extension, made by a multidisciplinary tumor board before the start of any treatment, is recommended by the European Society for Medical Oncology (ESMO) guideline for cervical cancer [2]. For International Federation of Gynecology and Obstetrics (FIGO) stage IA1 to IB1, surgery is the main treatment, with adjuvant radiotherapy (RT) ± chemotherapy in case of risk factors, and for FIGO stage IB2-IVA, concurrent chemoradiotherapy (CCRT) represents the standard [2]. Definitive RT ± chemotherapy can also be used for patients with FIGO stage IVB disease with oligometastasis [3] or who are not candidates for hysterectomy. Neoadjuvant chemotherapy remains controversial for locally advanced cervical cancer [2]. With regard to immunotherapy, in addition to its applications in recurrent or metastatic cervical cancer [4,5], ongoing trials are investigating the combination of immunotherapy with RT or CCRT in locally advanced cervical cancer [5,6]. Despite modern advances in various treatment modalities, the mortality of cervical cancer remains high, with a 5-year overall survival (OS) of about 65% after CCRT [7]. Therefore, it is crucial to identify prognostic factors to tailor personalized management strategies for patients with different risk levels. The FIGO staging system has been the most commonly used method to classify the prognosis of cervical cancer patients. The 5-year OS rates of FIGO stage I-IV cervical cancer were 83%-100%, 70%-80%, 42%, and 32%, respectively [8,9]. Over the years, researchers have made a tremendous effort to identify other clinical prognostic factors for OS [7].
High body mass index (BMI>25) at the time of cancer diagnosis was found to be positively associated with 2-year and 5-year survival rates [10]. Biological parameters, including pretreatment levels of hemoglobin, leucocytes, and platelets, were identified as prognostic factors for locally advanced cervical cancer [11]. A previously developed nomogram with a C-index of 0.713 identified several prognostic factors associated with OS, including squamous cell carcinoma antigen (SCC-Ag), BMI, tumor size, pelvic wall involvement, and para-aortic lymph node metastasis [12]. Concurrent chemotherapy (≥4 cycles) [13], monocyte count [14], age [15], and performance status were also found to be prognostic for survival. Another nomogram showed that tumor size, grading, and parametrial status affected 5-year OS in locally advanced cervical cancer primarily treated with neoadjuvant chemotherapy followed by radical surgery [16]. Nevertheless, the abovementioned prognostic factors mainly describe clinical features rather than radiotherapy parameters. Since radiotherapy forms the backbone of cervical cancer treatment, it is reasonable to presume that dose-volume histogram (DVH) parameters may have an impact on OS. Analysis of the dose-effect relationship between DVH parameters and prognosis for cervical cancer patients suggested that the doses received by 100%, 98%, and 90% of the high-risk clinical target volume (HR-CTV D100, HR-CTV D98, and HR-CTV D90) were independent factors affecting OS [17]. Retrospective DVH analysis showed that the equivalent dose in 2 Gy fractions (EQD2) of HR-CTV D90 was a significant determinant of OS in patients with uterine cervical cancer [13]. In addition to target volumes, the prognostic impact of DVH parameters of organs at risk (OARs) has been studied in a range of cancer types.
Multivariate analyses showed that lung V20 (volume covered by radiation dose of ≥ 20 Gy) and lung V5 (volume covered by radiation dose of ≥ 5 Gy) were associated with OS in patients with esophageal cancer treated with neoadjuvant chemoradiotherapy when adjusting for surgical margin and pathological treatment response [18]. To the best of our knowledge, there are no studies on the effect of DVH parameters of both tumor and OARs on OS during external beam radiotherapy (EBRT) of cervical cancer.
Different approaches can be employed to identify clinical and dosimetric parameters that affect the patient's outcome. A multidimensional nomogram has been developed for predicting progression-free survival (PFS) in patients with locoregionally advanced nasopharyngeal carcinoma [19]. The random survival forest model identified D99 (the dose that covered 99% of the volume) as an important variable associated with survival of high-grade glioma [20]. Besides the random survival forest model, the elastic-net model, as a machine learning method, yields higher discriminative performance in (chemo)radiotherapy outcome than other studied classifiers [21]. Therefore, in this study, we employed an elastic-net model to determine the key clinical and DVH parameters in predicting survival outcome of cervical cancer patients.

Description of Cohorts.
The local institutional ethics committee approved the study (reference number (2019) 049). All patients provided written informed consent for the use of personal medical records for academic purposes before treatment, and the consent form for this specific study was waived.
A cohort of patients diagnosed with cervical cancer in a single institute in China from January 2015 to February 2021 was selected for this study. All patients were treated with definitive radiotherapy. Eligible patients met the following criteria: ≥18 years old; previously untreated, pathologically confirmed cervical carcinoma; stage IB-IVB according to FIGO (2018) (only stage IVB with oligometastasis scheduled for radical chemoradiotherapy was included); main treatment was EBRT with or without chemotherapy followed by image-guided brachytherapy. Key exclusion criteria were the following: small cell carcinoma of the cervix, acquired immune deficiency syndrome, concomitant secondary primary malignancies, radiotherapy in adjuvant or recurrent settings, or failure to complete planned radiotherapy. The last follow-up date was 30 April 2021.

Radiation Therapy.
All patients received EBRT using RapidArc or three-dimensional conformal radiotherapy (3D-CRT) techniques. EBRT was delivered on a 6 MV linear accelerator. For RapidArc, GTV_P and GTV_N were defined as the primary gross tumor volume and the locoregional pathological lymph nodes, respectively, detected by physical examination, pelvic magnetic resonance imaging (MRI), or positron emission tomography (PET)/CT. PTV_4500 and PTV_5500 denoted the planning target volumes of the pelvis and the metastatic lymph nodes, which received prescribed doses of 45 Gy and 55 Gy, respectively: the prescription dose was 45 Gy in 25 fractions to PTV_4500 with a simultaneous integrated boost of 55 Gy to PTV_5500. For 3D-CRT, two sequential phases were used (phase I: 45 Gy/25 fractions to the pelvis; phase II: a boost to the pelvic wall of 16 Gy/8 fractions for FIGO stage IIIB or 10 Gy/5 fractions for other stages). All EBRT was delivered daily, 5 fractions per week. CT- or MRI-guided brachytherapy was performed 3-4 weeks after initiation of EBRT with a high-dose-rate 192Ir (iridium) source, once a week for a total of 4 times. The cumulative dose targets for the cervical tumor were >84 Gy (EQD2) for stage IB-IIIA and >90 Gy (EQD2) for stage IIIB or higher. DVH parameters during EBRT were obtained from the Varian Eclipse treatment planning system (version 15.0).
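For readers unfamiliar with EQD2, the conversion is usually written with the linear-quadratic model as below. This is a sketch under stated assumptions: the symbols D (total dose), d (dose per fraction), and n (number of fractions) are introduced here for illustration, and the tumor value α/β = 10 Gy is a commonly used convention that the text above does not state.

$$
\mathrm{EQD2} = D \cdot \frac{d + \alpha/\beta}{2 + \alpha/\beta}, \qquad D = n \cdot d
$$

As a worked example under that assumption, the EBRT prescription of 45 Gy in 25 fractions (d = 1.8 Gy) gives

$$
\mathrm{EQD2} = 45 \cdot \frac{1.8 + 10}{2 + 10} = 44.25\ \mathrm{Gy}.
$$

The brachytherapy contribution would be added in the same way, but the per-fraction brachytherapy dose is not given above, so it is not computed here.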

Chemotherapy.
Concurrent cisplatin at 40 mg/m2 was given weekly during EBRT. Weekly carboplatin (area under the curve (AUC) = 2 mg/ml/min) was used as an alternative if creatinine clearance was ≤50 ml/min. In cases involving a long radiotherapy waiting time, induction chemotherapy with paclitaxel plus carboplatin was given. Chemotherapy was not recommended for patients aged over 70 years or with FIGO stage IB1 disease.

Follow Up.
In the first 2 years of follow-up, all the patients had regular assessment every 3 months, then every 6 months in the third to fifth year, and yearly after the fifth year. OS was the time from the start of EBRT to the date of death from any cause or the last confirmed date of survival.

Univariate Analysis and Multivariable Analysis.
Univariate Cox-pH analysis was conducted to generate hazard ratios (HRs) with confidence intervals (CIs) for each single risk factor's contribution to OS. The factors extracted by the elastic-net models were used to build the final multivariable Cox-pH model. The concordance index (C-index) was then used to assess the performance of the final multivariable Cox-pH model. The final multivariable Cox-pH model for predicting OS was used to construct nomograms.
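To make the C-index concrete, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is an illustration only, not the implementation used in the study (which was carried out in R); the function name `harrell_c_index` and the toy data are hypothetical, and pairs tied in follow-up time are simply skipped, which is a simplification.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index for right-censored survival data.

    times:       follow-up time per patient
    events:      1 if death was observed, 0 if censored
    risk_scores: higher score means higher predicted risk of death
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient a has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter time ended in an event.
        # Simplification: pairs tied in time are skipped.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier death had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (hypothetical): four patients, two observed deaths.
toy_times = [5, 10, 15, 20]
toy_events = [1, 0, 1, 0]
toy_risk = [0.9, 0.4, 0.1, 0.6]
print(harrell_c_index(toy_times, toy_events, toy_risk))  # 0.75
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of risk.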

Elastic-Net Modeling.
Elastic-net regression is a type of penalized regression [22,23]. Elastic-net applies both an L1-norm penalty and an L2-norm penalty to the regression coefficients, with a mixing parameter (alpha) that defines the proportion of the penalty assigned to the L1 norm versus the L2 norm. Taken together, the elastic-net regression method allows retention of correlated covariates, but also regularizes model predictors in a manner that allows for improved prediction performance.
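The penalty described above can be written explicitly. The form below follows the parameterization commonly used for elastic-net Cox regression (for example in glmnet); the notation (β for the coefficient vector, p for the number of covariates, n for the number of patients, ℓ for the Cox log partial likelihood) is introduced here for illustration, and the constant scaling of the likelihood term varies between implementations.

$$
\hat{\beta} = \arg\min_{\beta} \left\{ -\frac{1}{n}\,\ell(\beta) + \lambda \left[ \alpha \sum_{j=1}^{p} |\beta_j| + \frac{1-\alpha}{2} \sum_{j=1}^{p} \beta_j^{2} \right] \right\}
$$

Here λ ≥ 0 controls the overall penalty strength and α ∈ [0, 1] is the mixing parameter: α = 1 gives the lasso (pure L1 penalty), α = 0 gives ridge regression (pure L2 penalty), and intermediate values combine sparse selection with the ability to keep groups of correlated covariates.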
Elastic-net models were constructed for the prediction of OS using 5-fold cross-validation with 100 bootstrap iterations, to approximate the models' generalization abilities in the absence of an external validation dataset [21,24]. To determine the important features for OS, we selected the best alpha and lambda in the elastic-net model using the C-index as the criterion. The features with significant coefficients in the elastic-net models and high selection frequencies across bootstrap iterations were selected as important factors.
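The resampling scheme and the selection-frequency tally can be sketched as follows. The text does not specify exactly how bootstrapping and cross-validation were combined, so this sketch makes a loud assumption: in each iteration the patients are shuffled into 5 folds, the held-out fold is left untouched, and the remaining patients are resampled with replacement to form the training set. The function names `resampling_splits` and `selection_frequency` are hypothetical, and the model fit itself is omitted.

```python
import random
from collections import Counter


def resampling_splits(n_patients, n_iter=100, n_folds=5, seed=0):
    """Yield (iteration, fold, train_idx, test_idx) index tuples.

    Assumed scheme (not confirmed by the source): shuffle patients into
    folds, then bootstrap-resample only the training portion so that no
    held-out patient leaks into the training set.
    """
    rng = random.Random(seed)
    for it in range(n_iter):
        order = list(range(n_patients))
        rng.shuffle(order)
        for k in range(n_folds):
            test_idx = order[k::n_folds]
            held_out = set(test_idx)
            pool = [i for i in order if i not in held_out]
            # Bootstrap: sample the training pool with replacement.
            train_idx = [rng.choice(pool) for _ in pool]
            yield it, k, train_idx, test_idx


def selection_frequency(selected_per_iteration):
    """Fraction of iterations in which each feature was selected."""
    counts = Counter(f for sel in selected_per_iteration for f in set(sel))
    n = len(selected_per_iteration)
    return {feature: c / n for feature, c in counts.items()}


# Hypothetical feature selections from four iterations.
example = [["hb", "v5"], ["v5"], ["v5", "gtv"], ["hb", "v5"]]
print(selection_frequency(example))
```

In a full analysis, each training set would be used to fit an elastic-net Cox model over a grid of alpha and lambda values, the held-out fold would supply the test C-index, and the features with nonzero coefficients would be passed to the tally.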

Statistical Considerations.
All continuous features were transformed using log10(x + 1). All statistical analyses were performed with R software (version 4.0.2, R Development Core Team, Vienna, Austria). The R package glmnet was used to implement elastic-net modeling. A P value less than 0.05 was considered statistically significant.
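The log10(x + 1) transform mentioned above can be illustrated in a few lines. This is a sketch, not the study's R code; the function name `log10_plus_one` is hypothetical, and the check for negative values reflects the assumption that clinical and dosimetric features are non-negative.

```python
import math


def log10_plus_one(values):
    """Apply the log10(x + 1) transform to non-negative values."""
    transformed = []
    for x in values:
        if x < 0:
            raise ValueError("log10(x + 1) transform expects non-negative input")
        transformed.append(math.log10(x + 1))
    return transformed


# Adding 1 keeps zero-valued features finite: 0 maps to 0.
print(log10_plus_one([0, 9, 99]))
```

The transform compresses right-skewed features such as volumes, so that a few very large values do not dominate the penalized fit.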

Patient Characteristics.
A total of 128 patients were assessed as eligible for inclusion in this study. Table 1 lists detailed characteristics of the study population. The median

DVH Parameters.
In this study, 20 DVH features were extracted, including dmax, dmean, and volume of the tumor targets (GTV_P, GTV_N, PTV_4500, and PTV_5500), and dmax, dmean, V5, V45, and volume of the OARs (body and bones), as summarized in Table 2. Pearson's correlations between all DVH parameters are shown in Figure 1. Relatively strong positive correlations were found among the tumor dosimetry parameters (Pearson's correlation > 0.5). There was little correlation between clinical characteristics and DVH parameters, and little correlation between tumor dosimetry and OAR dosimetry.
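As an illustration of what these DVH features measure, the sketch below derives dmax, dmean, volume, V5, and V45 from a list of per-voxel doses for one structure. In the study these values were exported from the treatment planning system, so this is only a conceptual sketch; the function name `dvh_metrics`, the equal-voxel-volume assumption, and the choice to report Vx as a percentage of the structure volume (rather than an absolute volume) are assumptions made here.

```python
def dvh_metrics(voxel_doses_gy, voxel_volume_cc, thresholds_gy=(5.0, 45.0)):
    """Summarize one structure's dose distribution from per-voxel doses.

    Assumes all voxels have the same volume. Vx is the percentage of the
    structure volume receiving at least x Gy.
    """
    if not voxel_doses_gy:
        raise ValueError("structure has no voxels")
    n = len(voxel_doses_gy)
    metrics = {
        "dmax": max(voxel_doses_gy),
        "dmean": sum(voxel_doses_gy) / n,
        "volume_cc": n * voxel_volume_cc,
    }
    for t in thresholds_gy:
        covered = sum(1 for d in voxel_doses_gy if d >= t)
        metrics[f"V{t:g}"] = 100.0 * covered / n
    return metrics


# Hypothetical structure with eight voxels of 0.5 cc each.
toy_doses = [1, 4, 5, 10, 50, 46, 0, 3]
print(dvh_metrics(toy_doses, voxel_volume_cc=0.5))
```

For the toy structure, half of the voxels receive at least 5 Gy, so V5 is 50%, and two of eight receive at least 45 Gy, so V45 is 25%.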

Prediction Performances of Elastic-Net Models.
To study the risk factors for survival, three kinds of elastic-net models were established: a model integrating clinical factors and DVH parameters, a model with only clinical factors, and a model with only DVH parameters. These three models had the best prediction performances when the alpha parameter was equal to 0.8, 0.7, and 0.5, respectively. The C-index was used to evaluate and compare the performances of the three models in the train set and the test set as shown in [...] to survival, which further indicated that DVH parameters supplied information complementary to clinical factors in survival prediction.

Important Factors in Elastic-Net Models.
The performances of all factors in the models integrating clinical and DVH parameters were summarized, including the mean and P value of their coefficients in the elastic-net models and their selection frequencies in 100 iterations, as shown in Figure 3 and Supplemental Table 1. Among the clinical factors, the hemoglobin level at baseline was an important protective factor against death (mean coefficient: 0.47, 95%CI: 0.38-0.57, P value: <0.01, frequency: 72%). Among the DVH parameters, GTV_P volume and body V5 were the strongest risk factors for death (mean coefficient: 1.26, 95%CI: 1.21-1.32, P value: <0.01, frequency: 92%; mean coefficient: 2.54, 95%CI: 2.1-3.09, P value: <0.01, frequency: 90%, respectively).

The Final Multivariable Cox-pH Model.
To support clinical use, the final multivariable Cox-pH model integrating the key clinical characteristic (hemoglobin at baseline) and DVH parameters (GTV_P volume and body V5) was constructed as shown in Figure 4(a)

Discussion
The present study analyzed the effects of dosimetric parameters and clinical characteristics on OS using machine learning algorithms.
The results showed that elastic-net models integrating clinical and DVH factors had the best prediction performance (C-index 0.76 in the train set and 0.74 in the test set). Three important factors were selected: baseline hemoglobin level, primary gross tumor volume (GTV_P), and body V5. The final multivariable Cox-pH model constructed using these important factors had a prediction performance (C-index: 0.78, 95%CI: 0.73-0.81) better than that of previous studies [25-27]. This indicates that adding DVH parameters to clinical factors improved the model's ability to predict OS. At the same time, the final multivariable Cox-pH model and the nomogram use only three indicators that are readily available in practice, which makes clinical application feasible.
Among the clinical factors, our study found that the hemoglobin level at baseline was an important protective factor against death, which is widely acknowledged. Many other studies have reached similar conclusions. Pretreatment hemoglobin was found to be a potential biomarker for survival prognosis not only in early cervical cancer [28] but also in locally advanced cervical carcinoma [12]. The first international expert consensus guideline, which set a minimum hemoglobin transfusion target of 90 g/L, was endorsed to balance tumor radiosensitivity with appropriate use of a scarce resource for patients with cervical cancer undergoing EBRT and brachytherapy [29]. A hemoglobin level of more than 90 g/L at presentation was positively associated with the 5-year OS rate [30]. A new score identified that hemoglobin <120 g/L at the time of diagnosis impacted disease-free survival (DFS) and OS [11].
Among the DVH parameters, GTV_P volume and body V5 were the strongest risk factors for death. This is consistent with the conclusions of other studies that the larger the GTV_P volume, the worse the survival. The 5-year survival rate of cervical cancer patients with tumor volume <40 cm3 was significantly better than that of patients with tumor volume >40 cm3 [31]. The total metabolic tumor volume was an independent prognostic factor for the recurrence-free survival of patients undergoing radical radiotherapy and chemotherapy for cervical cancer [32]. Research on other tumors also supports this conclusion. GTV_P volume ≥5 cm3 was associated with a significantly worse OS in patients with sinonasal mucosal melanoma [33]. Another finding suggested that a pathological tumor volume of ≥18 cm3 was significantly correlated with shorter OS in oral squamous cell carcinoma [34]. Similar conclusions were also reported in rectal cancer [35], nasopharyngeal carcinoma [36], supraglottic carcinoma [37], and glioblastoma [38].
Body V5 in particular is an important DVH risk factor for survival identified in our study, and it has received little attention in radiation therapy before. There are two types of radiation health effects: acute and late-onset disorders. Clinical symptoms of acute disorders begin with a decrease in lymphocytes, followed, with increasing radiation dose, by symptoms such as alopecia, skin erythema, hematopoietic damage, gastrointestinal damage, and central nervous system damage [39]. Body radiation can potentially result in both acute and long-lasting adverse effects, particularly on hematopoietic and immune cells [40]. Studies have shown that radiation-induced lymphocytopenia is associated with poor prognosis in solid tumors [41], such as cervical cancer [42] and non-small cell lung cancer [43]. Regarding late-onset disorders, the predominant health effects are cancer [44-46], non-cancer disease [47,48], and genetic effects [49-51]. In addition, it should be noted that with the development of modern radiotherapy techniques, such as intensity-modulated radiotherapy (IMRT), patients receive a larger volume of low-dose radiation. Body dose-volume distributions may influence the risk of second primary cancer [52]. Moreover, radiation-induced normal tissue damage and repair also have a dose-volume effect [53].
There are some limitations in this study. First, this is a retrospective study, and a prospective study is needed to collect more complete data. Secondly, since the international cervical cancer staging system does not include prognostic biomarkers, and current treatment recommendations are mainly based on staging, we did not include nonanatomical prognostic biomarkers, such as human papillomavirus (HPV) infection data and SCC-Ag values. Thirdly, the median follow-up for our analysis was 26.4 months, and longer follow-up is needed to fully assess long-term survival benefits. Lastly, although 5-fold cross-validation with 100 bootstrap iterations was used to assess the models' generalization abilities, external validation is needed in future studies. Nonetheless, the findings of our study provide valuable data to guide clinical practice and future research.
In conclusion, this is the first attempt to establish elastic-net models to evaluate the roles of DVH parameters in predicting OS in patients with cervical cancer. In addition to clinical factors, DVH parameters such as GTV_P volume and body V5 appear to be important predictors of survival outcome. These results can facilitate individualized tailoring of treatment and patient counseling in the holistic management of cervical cancer.

Data Availability
The datasets analyzed during the current study are not publicly available because the data are strictly confidential, but are available from the corresponding author upon reasonable request.

Ethical Approval
The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. All patients signed, at hospital admission, consent for the use of their data for retrospective and scientific investigation. The study was performed in accordance with the Declaration of Helsinki and was approved by the local ethics committee.

Consent
The authors affirm that human research participants provided informed consent for publication.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Authors' Contributions
Conception and design were performed by Zhiyuan Xu, Li Yang, Longhua Chen, and Hao Yu. Administrative support was provided by Zhiyuan Xu and Li Yang. Provision of study materials or patients was performed by Zhiyuan Xu, Li Yang, and Qin Liu. Collection and assembly of data were performed by Longhua Chen, Hao Yu, Zhiyuan Xu, Li Yang, and Qin Liu. Data analysis and interpretation were performed by Hao Yu, Longhua Chen, Zhiyuan Xu, and Li Yang. Manuscript writing was performed by all the authors. Final approval of the manuscript was given by all the authors. Zhiyuan Xu and Li Yang contributed equally.