Validation of a Machine Learning Approach for Venous Thromboembolism Risk Prediction in Oncology

Using kernel machine learning (ML) and random optimization (RO) techniques, we recently developed a set of venous thromboembolism (VTE) risk predictors, which could be useful to devise a web interface for VTE risk stratification in chemotherapy-treated cancer patients. This study was designed to validate a model incorporating the two best predictors and to compare their combined performance with that of the currently recommended Khorana score (KS). Age, sex, tumor site/stage, hematological attributes, blood lipids, glycemic indexes, liver and kidney function, BMI, performance status, and supportive and anticancer drugs of 608 cancer outpatients were all entered in the model, with numerical attributes analyzed as continuous values. The VTE rate was 7.1%. The combined model yielded a positive likelihood ratio (+LR) of 2.30, a negative LR (−LR) of 0.46, and a hazard ratio (HR) of 4.88 (95% CI: 2.54–9.37), with a significant improvement over the KS [HR 1.73 (95% CI: 0.47–6.37)]. These results confirm that a ML approach might be of clinical value for VTE risk stratification in chemotherapy-treated cancer outpatients and suggest that the proposed ML-RO model could be useful to design a web service that provides physicians with a graphical interface to support the critical phase of decision making.


Introduction
In recent years, the growing availability of large sets of electronic health records (big data) has posed new challenging possibilities in terms of data management/analysis, as they go beyond the concept of "statistical sampling" in favor of a heuristic search of correlations between phenomena for the construction of predictive models [1]. This is particularly true in oncology, to the point that the 2016 report of the Blue Ribbon Panel of the Cancer Moonshot recommended mining past patient data to predict future patient outcomes and to minimize the debilitating side effects of cancer treatment [2].
In this context, a compelling challenge in oncology is predicting the risk of chemotherapy-associated venous thromboembolism (VTE), as VTE occurrence may result in treatment delays, impaired quality of life, and increased mortality [3]. Accordingly, although thromboprophylaxis for primary prevention is not recommended, assessment of the patient's individual risk of VTE prior to chemotherapy is advocated [4], based on the Khorana Score (KS) [5], the sole risk assessment model (RAM) currently available for this clinical setting.
However, even though KS [5] is a user-friendly VTE risk predictor (based on routinely available variables [6]), it is strongly dependent on tumor type and does not consider treatment-related factors influencing VTE development. Therefore, its external validation was not univocal [7][8][9], its major weakness being a high proportion of patients (>50%) falling into the intermediate risk category [9]. Thus, expanded RAMs including novel biomarkers, potentially improving VTE risk prediction, have been proposed [7]. Yet, their use may be too expensive for widespread screening in low- and middle-income regions.
In this light, we hypothesized that machine learning (ML) would be a solid base to build an inexpensive predictive tool for VTE risk assessment in chemotherapy-treated cancer outpatients that could be easily adapted to different local situations or field advancements [10]. We, therefore, applied a combined approach of kernel ML and random optimization (RO) to design a set of VTE predictors capable of exploiting significant patterns in routinely collected demographic, clinical, and biochemical data that can be used in a clinical decision support system for VTE risk stratification prior to chemotherapy [11]. Among these, we selected the two best predictors out of a range of ten ML-RO runs (ML-RO-2 and ML-RO-3), which could be useful to devise a web-based graphical interface for VTE risk stratification.
Here, we report the results of a monoinstitutional pilot study in which the ML-RO-2 and ML-RO-3 were combined to validate their clinical usefulness in a cohort of 608 ambulatory cancer patients, prospectively followed during chemotherapy at the medical oncology ward of the Tor Vergata Clinical Center.

Patients and Methods
2.1. Patient Dataset. The complete patient dataset for VTE risk assessment (n = 1433) was obtained through a joint effort between the PTV Bio.Ca.Re. (Policlinico Tor Vergata Biospecimen Cancer Repository) and the BioBIM (Interinstitutional Multidisciplinary Biobank, IRCCS San Raffaele Pisana). The dataset consisted of ambulatory cancer patients enrolled, in accordance with the principles embodied in the Declaration of Helsinki, to investigate possible predictors of chemotherapy-associated VTE. The study was reviewed and approved by the Scientific Institute for Research, Hospitalization and Health Care San Raffaele Pisana and by the Tor Vergata University Institutional Review Boards. All study participants or their legal guardian provided informed written consent about personal and medical data collection prior to study enrollment.
Of the 1433 patients, 825 were included in the original training set used to devise the ML-RO predictors. Clinical characteristics and laboratory attributes of these patients are available in [11]. For the current study, a cohort of 608 patients was obtained by supplementing the testing set (n = 354) analyzed in [11] with patients enrolled thereafter (from July 2015 to June 2016). All patients were chemotherapy naive; specific anticancer treatment was instituted according to international guidelines (11% neoadjuvant, 29% adjuvant, and 60% metastatic; 3% of patients received concurrent radiotherapy). Eligibility criteria were as previously reported [10,12]. Patients were regularly seen at scheduled visits; additional visits were arranged at the occurrence of clinically suspected VTE. Initial VTE risk stratification was performed by the KS at a 3-point cutoff, as currently recommended [5]. All patients were followed up for a median period of 10 months, during which outcomes were prospectively recorded. The study outcome was defined as the occurrence of a first symptomatic or asymptomatic VTE episode, either deep vein thrombosis (DVT) or pulmonary embolism (PE), during active treatment. No patient received thromboprophylaxis or antiplatelet drugs.
The following variables were taken into consideration: age, sex, tumor site and stage, hematological attributes (including blood cell counts, hemoglobin, and neutrophil- and platelet-lymphocyte ratios), fasting blood lipids [13], glycemic indexes [14], liver and kidney function [15], body mass index (BMI), Eastern Cooperative Oncology Group Performance Status (ECOG-PS), and supportive and anticancer drugs. Numerical attributes were analyzed as continuous values. Variables were clustered into groups according to clinical significance [11]. Table 1 summarizes clinical and laboratory attributes of patients.

2.2. Data Analysis.
In a context of precision medicine, we introduced a new methodology based on a particular class of learning machines (kernel machines) and on a RO model to determine the relative importance of different groups of clinical attributes in the final prediction decisions [11]. The algorithm was devised as previously reported using a 3-fold cross-validation technique on a training set. A testing set was used to compute the final performance of our risk predictors. Missing clinical attribute values were treated according to the predictive value imputation (PVI) method [16].
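As a rough illustration of the two ingredients named above, the following Python sketch shows a 3-fold index split and a generic random optimization loop over group weights. It is not the algorithm of [11]: the function names, the weight normalization, the Gaussian perturbation step, and the toy objective are all assumptions of this sketch. In a real setting, the scoring function would be something like the cross-validated f-measure of a kernel machine trained with the candidate group weights.

```python
import random

def k_fold_indices(n_samples, k=3, seed=0):
    """Shuffle sample indices and split them into k near-equal folds."""
    rng = random.Random(seed)
    idx = list(range(n_samples))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

def random_optimization(score_fn, n_groups, n_iter=200, step=0.1, seed=0):
    """Perturb the group weights at random; keep a candidate only if it scores higher."""
    rng = random.Random(seed)
    weights = [1.0 / n_groups] * n_groups      # start from equal weights
    best = score_fn(weights)
    for _ in range(n_iter):
        candidate = [max(0.0, w + rng.gauss(0.0, step)) for w in weights]
        total = sum(candidate) or 1.0          # guard against an all-zero candidate
        candidate = [w / total for w in candidate]
        score = score_fn(candidate)
        if score > best:
            weights, best = candidate, score
    return weights, best

# Hypothetical stand-in objective (NOT a clinical score): it simply rewards
# weights close to an arbitrary target so that the loop has something to optimize.
TARGET = [0.5, 0.3, 0.2]

def toy_score(weights):
    return -sum((w - t) ** 2 for w, t in zip(weights, TARGET))

folds = k_fold_indices(n_samples=9, k=3)
weights, best = random_optimization(toy_score, n_groups=3)
```

Each fold would serve once as the held-out part while the other two are used for fitting; the optimization loop only ever accepts weight vectors that improve the chosen score.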
A total of 608 patients were entered into the study on the hypothesis that this would detect a difference with a likelihood of >80%, at a two-sided 5% significance level, if the true hazard ratio (HR) is 2. This was based on the assumption of a median follow-up duration of at least 6 months and an estimated VTE rate of 10%. Patients' data are presented as percentages, mean (SD), or median and interquartile range (IQR). Receiver operating characteristic (ROC) curve and Cox proportional hazard analyses were performed with MedCalc Statistical Software version 13.1.2 (MedCalc Software bvba, Ostend, Belgium). Bayesian analysis was performed, and positive (+LR) and negative (−LR) likelihood ratios were used to estimate the probability of having or not having VTE. Survival curves were calculated by the Kaplan-Meier and log-rank methods using a computer software package (Statistica 8.0, StatSoft Inc., Tulsa, OK). VTE-free survival time was calculated from the date of enrollment until the date of VTE (either DVT or PE) or of the last follow-up. For administrative censoring, follow-up was ended on December 20, 2016. For patients receiving neoadjuvant chemotherapy, follow-up was stopped at completion of the entire antiblastic treatment and before surgery.
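The way likelihood ratios translate into a probability of VTE can be sketched with the odds form of Bayes' theorem. The short Python example below uses the +LR and −LR reported in the abstract for the combined model; taking the overall VTE rate of the cohort (7.1%) as the pretest probability is an assumption of this sketch, and the function name is purely illustrative.

```python
def post_test_probability(pretest_prob, likelihood_ratio):
    """Convert a pretest probability into a post-test probability via odds."""
    pretest_odds = pretest_prob / (1.0 - pretest_prob)
    post_odds = pretest_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)

# Values reported in the abstract; using the cohort VTE rate as the
# pretest probability is an assumption of this sketch.
vte_rate = 0.071
p_if_at_risk = post_test_probability(vte_rate, 2.30)   # positive result, +LR
p_if_low_risk = post_test_probability(vte_rate, 0.46)  # negative result, -LR
print(f"{p_if_at_risk:.3f} {p_if_low_risk:.3f}")
```

Under these assumptions, a positive result raises the estimated probability of VTE to roughly 15%, whereas a negative result lowers it to roughly 3.4%.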

Results and Discussion
No patient underwent surgery during follow-up or was admitted to a clinic for acute medical illness requiring thromboprophylaxis. VTE was diagnosed in 7.1% of patients (11 PE and 32 DVT; median time to VTE: 2.5 months), and 21 of 43 patients were incidentally diagnosed with asymptomatic VTE (7 PE) at the time of CT scan for restaging, in agreement with previous reports [12,13]. Competing mortality at 6 months was <2%, and 9 patients without VTE died of their disease during this time frame.
Overall, 37 (6.1%) patients were at high risk for VTE (KS ≥ 3), as per current guidelines. Of these, only 4 (10.8%) patients developed VTE during treatment. On the other hand, 250 (41.5%) patients had an intermediate risk (KS 1 or 2), whereas 318 (52.4%) were classified as low risk based on a KS of 0. VTE rates in the intermediate- and low-risk categories were 9.2% (n = 23) and 5.0% (n = 16), respectively. Three patients with glioblastoma were not included in the analysis, as the KS is not validated in this cancer type. Accordingly, the overall performance of KS in our population, despite a 94.1% specificity, was characterized by a 9.3% sensitivity, a 10. [...] used for the development of KS (showing a high negative predictive value of 98.5%, but a PPV lower than 7% [6]) and those by other authors reporting that the majority of events (50% to 85%) occur in patients at intermediate risk [7,17,18]. In this context, it is conceivable that clinical settings different from that in which KS was originally developed might be responsible for the inconsistencies observed among various studies. Undoubtedly, KS represents an interesting endeavor for VTE risk prediction, owing to its ease of use and lack of additional health care costs. However, several recent reports indicate that it might not be suitable in specific local situations or populations, such as in the case of lung [9,19,20] or pancreatic [21] cancer, where the KS does not correctly stratify patients using a threshold of ≥3 versus <3. An additional explanation of these discrepancies stems from the fact that no information on anticancer [9] or supportive drugs was available for the population used for the development and validation of the KS.
Furthermore, Lee and coworkers [19] suggested that the lack of predictive significance of the KS in particular clinical settings could be explained by differences in the proportion of patients with BMI ≥ 35 (e.g., 0.4% in their study versus 12.3% in the one by Khorana et al.), raising the hypothesis that "an area-specific cutoff point for BMI among the Khorana variables should be taken into consideration" in different ethnicities [19].
In this context, the availability of a ML approach that can be locally customized and personalized on individual patient attributes is intriguing. For the present analysis, we selected ML-RO-2 and ML-RO-3 as the best performing risk predictors based on the values of precision [(P) positive predictive value in ML], recall [(R) sensitivity in ML], and f-measure [the harmonic mean of P and R, calculated as 2PR/(P + R)] as previously reported [11]. Here, using an extended dataset of 608 patients, ML-RO-2 and ML-RO-3 showed f-measures of 0.213 and 0.211, respectively, which were substantially higher than that calculated for the KS (f-measure: 0.100) and similar to those originally reported [11], thus confirming the clinical soundness of this approach.
At this point, it is important to emphasize that the two models not only were the best in terms of prediction capacity but also had a complementary configuration of weights (Figure 2). In particular, ML-RO-2 was strongly weighted on blood lipids, BMI, and ECOG performance status, while ML-RO-3 had the highest weights for age and blood lipids [11]. This is consistent with literature data showing that low HDL cholesterol levels [13] and ECOG-PS [22] are among the best predictors of increased VTE risk in chemotherapy-treated cancer patients in multiple regression models. Moreover, tumor site and stage and anticancer drugs maintained a considerable weight in both models (Figure 2), which is not surprising, since these clinical attributes have also been associated with an increased risk of developing VTE [6,21,23].
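The f-measure reported for the KS can be reproduced from the patient counts given earlier in this section. The Python sketch below assumes the usual confusion-matrix reading of those counts (37 patients with KS ≥ 3, of whom 4 developed VTE, and 23 + 16 VTE events among patients with KS < 3); the function name is illustrative.

```python
def precision_recall_f(tp, fp, fn):
    """Return precision (P), recall (R), and f-measure 2PR/(P + R)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# KS >= 3 treated as a positive prediction:
#   true positives  = 4        (high-risk patients who developed VTE)
#   false positives = 37 - 4   (high-risk patients without VTE)
#   false negatives = 23 + 16  (VTE events in intermediate- and low-risk patients)
ks_p, ks_r, ks_f = precision_recall_f(tp=4, fp=37 - 4, fn=23 + 16)
print(f"P={ks_p:.3f} R={ks_r:.3f} F={ks_f:.3f}")
```

With these counts the precision is 4/37 (10.8%), the recall is 4/43 (9.3%), and the f-measure is 0.100, matching the values reported for the KS.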
Nevertheless, the performance of both predictors could be further enhanced. Thus, we sought to investigate whether a combined approach may be of advantage over the individual predictors or the KS. It should be noted that the adoption of a model incorporating a pair of predictors implies that risk evaluation is represented by a three-level stratification (depending on whether a patient is flagged as at risk by both predictors, by only one, or by neither of them). However, while this configuration is capable of reducing the number of false negatives and false positives, it introduces some degree of uncertainty represented by an intermediate risk class. As reported in Figure 1, the combined model resulted in an overall improvement of VTE risk prediction performance, with a 0.716 AUROC, which was significantly higher than that observed with each single predictor. The robustness of this combined model was further corroborated by the results of a VTE-free survival analysis in which patients were considered at risk only in the event of a concordance of both predictors. As shown in Figure 3(a), only 3.4% of patients classified as low risk by the combined ML predictor developed VTE during chemotherapy, compared with 14.9% of those classified as at risk (log-rank test = 5.29; p < 0.0001). On the other hand, despite the high specificity, the KS used at a cutoff ≥ 3 points, as currently recommended, resulted in a 6-month VTE-free survival rate not significantly different from that of low-risk patients (89% versus 94%, resp.; log-rank test = 1.01; p = 0.309) (Figure 3(b)). Of interest, the predictive value of the combined ML-RO model was confirmed in a subgroup analysis of patients with tumors generally considered at low VTE risk (0 points in the Khorana score; i.e., breast or colorectal cancers) or at intermediate VTE risk (1 point in the Khorana score; i.e., lung, gynecologic, or urinary cancers) (Figure 4), which further suggests that a ML approach may be of advantage over the currently recommended KS.
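The three-level stratification and the concordance rule used for the survival analysis can be sketched as follows. This is a minimal Python illustration: the function names and the class labels ("low", "intermediate", "high") are hypothetical, and each predictor is assumed to return a binary at-risk flag.

```python
def combined_risk_class(pred_a, pred_b):
    """Three-level class from two binary predictors: flagged by none, one, or both."""
    n_flags = int(bool(pred_a)) + int(bool(pred_b))
    return ("low", "intermediate", "high")[n_flags]

def at_risk_by_concordance(pred_a, pred_b):
    """Binary rule used for the survival analysis: at risk only if both predictors agree."""
    return bool(pred_a) and bool(pred_b)

# Example: hypothetical outputs of the two predictors for four patients.
patients = [(True, True), (True, False), (False, True), (False, False)]
classes = [combined_risk_class(a, b) for a, b in patients]
flags = [at_risk_by_concordance(a, b) for a, b in patients]
print(classes, flags)
```

Collapsing the intermediate class into the low-risk group, as the concordance rule does, trades some sensitivity for a cleaner binary decision.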
These results demonstrate that a ML approach, optimizing the relative weight of groups of clinical attributes, might be of clinical value in predicting a first VTE episode in chemotherapy-treated cancer outpatients compared to other RAMs, which are based on the arbitrary assignment of a score according to multivariable analyses.
There are, of course, some limitations to acknowledge. First, the study was monoinstitutional. Second, the sample size was relatively small, ultimately leading to a small number of recorded events. Nonetheless, the data reported here suggest that the use of ML algorithms and RO models might be of advantage in developing local classifiers capable of improving VTE risk prediction, while retaining some advantages (e.g., recalculation as new data accumulate over time) in a perspective of precision medicine. Furthermore, the model proposed here has the strength that, since all the variables are usually included in the routine workup of cancer patients, the risk calculation comes at practically no cost to the health system. Future application of a ML approach might help oncologists in the difficult phase of decision making by limiting observer subjectivity. In particular, the combined use of a set of ML-RO predictors could be useful to design a web service with a graphical interface supporting the oncologist in the critical phase of VTE risk assessment. At present, we are working on the architecture of the decision server and its implementation with the best kernel functions to estimate the risk of VTE as a binary value (at risk or low risk).

Conclusions
As the world moves toward a big data scenario [24], the possibility of using a machine learning approach to devise a RAM that takes into consideration individual biological variability, environmental exposure, and lifestyle is particularly appealing and fits well into a context of precision medicine as advocated by the Cancer Moonshot initiative.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Authors' Contributions
Patrizia Ferroni and Mario Roselli designed the study, analyzed and interpreted the clinical data, and wrote the manuscript. Fabio M. Zanzotto and Noemi Scarpato designed the algorithm, performed the machine learning experiments, and wrote the manuscript. Silvia Riondino collected clinical and laboratory data, interpreted the data, and wrote the manuscript. Fiorella Guadagni designed the study, analyzed and interpreted the data, and critically revised the manuscript. All authors revised and approved the final version of the manuscript. Fiorella Guadagni and Mario Roselli are senior authors for equal contribution.