Predictive Model for Overall Survival and Cancer-Specific Survival in Patients with Esophageal Adenocarcinoma

Objective. In recent years, the incidence of esophageal adenocarcinoma (EAC) has increased rapidly, while the prognosis of diagnosed patients remains poor and has improved only slightly. Methods. We extracted 6,466 cases from the Surveillance, Epidemiology, and End Results (SEER) dataset (1975–2017) with detailed demographic characteristics, including age at diagnosis, sex, ethnicity, and marital status, and clinical features, including tumor grade, stage at diagnosis, and treatment modalities (radiation therapy, chemotherapy, and surgery). The cases were randomly divided into training and validation cohorts. Univariate and multivariate Cox analyses were conducted to identify significant variables for constructing the nomogram. The predictive power of the model was then assessed by the Harrell concordance index (C-index) and the area under the curve (AUC) of the receiver operating characteristic (ROC) curve. Results. Multivariate analysis revealed that age, marital status, insurance, tumor grade, TNM stage, surgery, and chemotherapy were all significantly associated with overall survival (OS) and cancer-specific survival (CSS). These characteristics were used to build a nomogram. The discrimination of the nomogram for OS and CSS prediction in the training set was excellent (C-index = 0.762, 95% CI: 0.754–0.770 for OS and C-index = 0.774, 95% CI: 0.766–0.782 for CSS). The AUCs of the nomogram for predicting 2- and 5-year survival were 0.834 and 0.853 for OS and 0.844 and 0.866 for CSS. Similar results were observed in the internal validation set. Conclusion. We established a novel nomogram for predicting OS and CSS in EAC patients with good accuracy, which can help clinicians predict the survival of individual patients and select optimal treatment strategies.


Introduction
The estimated incidence of esophageal adenocarcinoma (EAC) in the United States was 17,650 cases in 2019 [1], and the incidence rate of EAC has surpassed that of esophageal squamous cell carcinoma (ESCC), making EAC the main histologic type of esophageal cancer in the West [2][3][4]. Despite a significant increase in incidence, the 5-year survival for EAC has improved only marginally, from 9% in the 1970s to 22% in 2009 [5]. Prior epidemiological studies have demonstrated associations between EAC and family history, smoking, older age, male gender, central obesity, gastroesophageal reflux disease (GERD), and Barrett's esophagus (BE) [6]. However, it is evident that the prognostic system derived from the Kaplan-Meier estimator becomes less relevant over time after diagnosis [7,8], highlighting the need for an improved survival prediction system for EAC patients.
Based on the population-based Surveillance, Epidemiology, and End Results (SEER) database, multiple nomograms have been developed from multivariate regression models for esophageal cancer (EC) [9,10]. Currently, neoadjuvant chemoradiation followed by esophagectomy (trimodality therapy) is the standard treatment for locally advanced esophageal carcinoma [11], but a significant proportion of patients relapse and die after treatment. Although several prognostic evaluations have assessed how trimodality therapy or pharmaceutical treatments (proton pump inhibitors (PPIs), statins, nonsteroidal anti-inflammatory drugs (NSAIDs), and metformin) affect outcomes [12,13], an accurate and clinically applicable prognostic model for EAC is still needed.
In this study, we developed a nomogram based on a multivariate Cox proportional hazards regression model that incorporates comprehensive demographic and baseline clinical variables, including age, race, insurance and marital status, tumor grade, primary site, clinical stage, chemotherapy, surgery, and radiotherapy strategy. Using scaled line segments, the prognostic indicators were listed and scored, and we developed and validated a new model predicting overall survival (OS) and cancer-specific survival (CSS) for EAC patients. The performance of the nomogram, as indicated by the Harrell C-index and the area under the curve (AUC) of the receiver operating characteristic (ROC) curve, was excellent.
Thus, we believe that this novel nomogram for patients with EAC could assist clinicians in predicting the survival of individual patients.

Patients.
A total of 39,783 EAC patients diagnosed between 1975 and 2017 were identified using SEER*Stat software (version 8.3.5) from the SEER registry database of the National Cancer Institute, which covers about 28% of the US population and contains a large amount of evidence-based medical information [14]. Patients with incomplete staging under the 7th edition of the American Joint Committee on Cancer (AJCC) Tumor-Node-Metastasis (TNM) staging system were excluded. Then, patients with multiple primary tumors were excluded. In addition, patients with incomplete survival data, missing data in the SEER cause-specific death classification, unknown surgery, unknown grade, unknown location, unknown race, unknown insurance, or unknown marital status were also excluded from the study. Finally, the 6,466 enrolled cases were randomly assigned to the training set (4,528) and the validation set (1,938) (Figure 1). Because all of the data used in this study were obtained from the publicly available SEER database, no local ethical approval or declaration was required. All data used in this study are publicly available (https://seer.cancer.gov/).
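The random assignment into cohorts can be illustrated with a minimal sketch. The helper name `split_cohort`, the fixed seed, and the integer case IDs are assumptions for illustration only; the text reports the cohort sizes (4,528 and 1,938) but not how the randomization was carried out.

```python
import random

def split_cohort(case_ids, n_train, seed=0):
    """Randomly assign cases to a training set of size n_train and a validation set."""
    ids = list(case_ids)
    if not 0 <= n_train <= len(ids):
        raise ValueError("n_train must be between 0 and the number of cases")
    rng = random.Random(seed)  # fixed seed is an assumption, used for reproducibility
    rng.shuffle(ids)
    return ids[:n_train], ids[n_train:]

# Hypothetical integer IDs standing in for the 6,466 eligible SEER cases.
train_ids, valid_ids = split_cohort(range(6466), n_train=4528)
print(len(train_ids), len(valid_ids))
```

Shuffling once and slicing guarantees that the two cohorts are disjoint and together cover every eligible case.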

Construction and Validation of the Nomogram.
The data of the training cohort were used to establish the nomogram. The endpoints, OS and CSS, were measured from the date of first diagnosis to the date of death from any cause. Survival was estimated using the Kaplan-Meier method and Cox regression analysis. Univariate and multivariate analyses were performed to determine independent prognostic variables. Then, nomograms to predict the 2- and 5-year OS and CSS rates were constructed from the variables that were significant in the multivariate analysis. The discriminatory performance of the nomograms was assessed by the C-index and AUC. Calibration curves were created using the marginal estimation and the average prediction probability of the model. Furthermore, the nomograms were compared with the AJCC 7th TNM stage in terms of C-index and AUC.
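To make the discrimination measure concrete, the following is a minimal sketch of Harrell's C-index in plain Python. It is not the routine used in the study (the analyses were run in R); the function name and the toy follow-up data are hypothetical, and pairs with tied follow-up times are simply skipped in this simplified version.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable patient pairs that the risk score orders correctly.

    A pair (i, j) is comparable when patient i had an observed event
    (events[i] == 1) strictly before patient j's follow-up time.
    Tied risk scores count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data (hypothetical): follow-up in months, 1 = death observed, 0 = censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.4, 0.6, 0.1]
c_index = harrell_c_index(times, events, risk_scores)
print(c_index)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported C-index values should be read.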

Statistical Analysis.
Participant demographics were compared using the χ² test. All statistical analyses were performed using R version 3.4.2 (the R Foundation for Statistical Computing, Vienna, Austria; http://www.r-project.org). A two-tailed p < 0.05 was considered statistically significant.
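As an illustration of the χ² comparison of a categorical characteristic between two cohorts, a minimal sketch follows. The counts are invented and do not come from Table 1, the function names are hypothetical, no continuity correction is applied, and the p-value helper is valid only for one degree of freedom (a 2 × 2 table).

```python
import math

def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

def p_value_one_dof(stat):
    """Upper-tail p-value of a chi-square statistic with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

# Hypothetical 2 x 2 counts: rows = cohorts, columns = two levels of a characteristic.
counts = [[30, 20], [20, 30]]
stat, dof = chi_square_statistic(counts)
p_value = p_value_one_dof(stat)
print(stat, dof, p_value)
```

For tables with more than one degree of freedom, the p-value requires the general chi-square distribution, which a statistics library would normally supply.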

Patient Characteristics.
The demographic and clinical variables are listed in Table 1.

Patient Prognosis Analysis in the Training Cohort.
The univariate and multivariate analyses in the training cohort are listed in Table 2; variables with p < 0.200 in the univariate analysis were further assessed in the multivariate analysis of OS and CSS. Although a stronger male than female predominance in incidence has been reported [15,16], no survival difference between the sexes was found here. In univariate models, age, race, marital status, insurance, tumor differentiation grade, primary site, tumor staging, surgery, chemotherapy, and radiation therapy (overall p < 0.05) were significantly associated with OS. In the multivariable analysis, age groups above 60 years (hazard ratio (HR) = 1.198, 1.567, 2.212; 95%), marital status (p = 0.004), insurance (overall p < 0.005), poor tumor differentiation grade (grade III: HR = 1.455, 95% CI: 1.028-2.059, p = 0.034; grade IV: HR = 1.558, 95% CI: 1.327-1.829, p ≤ 0.001), tumor staging (overall p ≤ 0.001), surgery, and chemotherapy (overall p ≤ 0.001) were independent predictors of OS. In the univariate and multivariate analyses of CSS, the parameters significantly associated with survival were consistent with those for OS. Notably, radiotherapy did not affect OS or CSS of EAC patients (p = 0.354 and 0.289, respectively).
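The relationship between a Cox coefficient, its hazard ratio, and the 95% confidence interval can be sketched as follows. The sketch assumes the reported intervals are Wald intervals, symmetric on the log scale, which the text does not state explicitly. The helper names are hypothetical, and the standard error is back-calculated from the reported grade IV interval, not taken from the paper.

```python
import math

def hazard_ratio_ci(beta, se, z=1.96):
    """Hazard ratio exp(beta) with a Wald confidence interval exp(beta +/- z * se)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def se_from_ci(lower, upper, z=1.96):
    """Recover the standard error of beta from a CI that is symmetric on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)

# Grade IV values as reported in the text: HR = 1.558, 95% CI: 1.327-1.829.
beta = math.log(1.558)
se = se_from_ci(1.327, 1.829)
hr, ci_low, ci_high = hazard_ratio_ci(beta, se)
print(round(hr, 3), round(ci_low, 3), round(ci_high, 3))
```

Under this assumption the recomputed interval matches the reported one closely, which suggests the published HR and CI are internally consistent.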

Nomograms for Predicting OS and CSS of EAC Patients.
The nomograms based on the multivariate Cox regression models were developed to estimate 2-year and 5-year OS and CSS probabilities (Figure 2). By adding up the scores for each selected variable, a patient's individual survival probability can be easily calculated, and the performance of the nomograms was assessed by calculating Harrell's C-index. OS and CSS were better for patients under the age of 60, patients with comparatively better tumor differentiation and earlier stages, patients who were insured and married, and patients who received surgery or chemotherapy. The C-index of the nomogram for predicting OS was 0.762 (95% CI: 0.754-0.770) in the training cohort and 0.770 (95% CI: 0.758-0.782) in the validation cohort. For CSS prediction, the C-index was 0.774 (95% CI: 0.766-0.782) in the training cohort and 0.783 (95% CI: 0.770-0.797) in the validation cohort. The nomograms for OS and CSS prediction demonstrated relatively good accuracy compared with the AJCC 7th TNM stage (Table 3).
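How a nomogram turns Cox coefficients into points and then into a survival probability can be sketched as below. Every number here is hypothetical: the coefficients, the variable levels, and the baseline survival are invented for illustration and are not the fitted values behind Figure 2. The sketch assumes the usual construction in which the variable with the widest effect range spans 100 points and survival follows S(t | x) = S0(t) ** exp(linear predictor).

```python
import math

# Hypothetical log-hazard coefficients per category (lowest-risk level = 0.0).
COEFS = {
    "age": {"<60": 0.0, "60-69": 0.18, "70-79": 0.45, ">=80": 0.79},
    "surgery": {"yes": 0.0, "no": 0.90},
    "stage": {"I": 0.0, "II": 0.40, "III": 0.80, "IV": 1.60},
}

def points_table(coefs, max_points=100.0):
    """Scale coefficients so the variable with the widest range spans max_points."""
    widest = max(max(lv.values()) - min(lv.values()) for lv in coefs.values())
    scale = max_points / widest
    table = {
        var: {lvl: (b - min(lv.values())) * scale for lvl, b in lv.items()}
        for var, lv in coefs.items()
    }
    return table, scale

def total_points(patient, table):
    """Add up the points of each variable level, as on the nomogram axes."""
    return sum(table[var][level] for var, level in patient.items())

def survival_probability(points, scale, baseline_survival):
    """Survival at a fixed time; baseline_survival is that of a 0-point patient."""
    linear_predictor = points / scale
    return baseline_survival ** math.exp(linear_predictor)

table, scale = points_table(COEFS)
patient = {"age": "70-79", "surgery": "no", "stage": "III"}  # hypothetical patient
points = total_points(patient, table)
prob = survival_probability(points, scale, baseline_survival=0.9)  # assumed S0
print(points, prob)
```

The point scale is only a linear rescaling of the Cox linear predictor, so summing points on the nomogram is equivalent to summing coefficients in the regression model.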
Then, calibration plots of 2- and 5-year OS probabilities confirmed optimal agreement between the nomogram-predicted survival and actual observations in (Table 4).

Discussion
A previous study using the SEER 1973-2009 dataset reported that the overall 5-year survival rate was 9-22% in all EC patients [17]. Furthermore, the United States Cancer Statistics in 2018 reported that the 5-year overall relative survival of EC was 19% (2008 to 2014), and a hospital-based pooled analysis in China reported that the 5-year overall survival was around 40%, with an increase over time from 2000 to 2018 [18,19]. Overall, the prognosis of EC is poor. Over the past 30 years, the incidence of EAC has increased rapidly and has surpassed that of ESCC in a number of Western countries, including the United Kingdom (UK), the Netherlands, Ireland, New Zealand, the United States (US), Australia, Denmark, Canada, and Sweden [3,20,21]. With the steady increase in the number of EAC cases, there is a growing need for accurate estimates of disease outcomes. Using a rich data source, the SEER-Medicare population, we identified 6,466 patients diagnosed with EAC between 1975 and 2017, which allowed for reliable analyses of subgroups and trends in survival after diagnosis. Furthermore, the excellent predictive power of the nomograms was confirmed by C-index and AUC values that were higher than those of the AJCC 7th TNM staging system in both the training and validation sets.
In this study, we constructed well-calibrated prognostic nomograms to predict OS and CSS in patients with EAC. Consistent with prior research, predictive parameters including age, marital status, insurance, tumor differentiation, and TNM stage were associated with OS and CSS [12,[22][23][24][25]. Patients over 60 years of age, with relatively little family care and support, with poor tumor differentiation, and at an advanced stage had the worst prognosis. Interestingly, ethnic disparities and primary site, which are independent prognostic factors in ESCC patients [25,26], were not significant for OS and CSS in EAC patients. Further evidence may be needed to confirm the value of these parameters.
Surgery is the primary treatment for EC. Although only 34.6% of EAC patients received surgery (including endoscopic therapy, esophagectomy, esophagectomy with gastrectomy, and combinations), our data showed that the OS and CSS of patients who underwent surgery were significantly longer than those of patients who did not. To our knowledge, patients with EAC receive chemotherapy more frequently than patients with ESCC. Of note, ∼70.3% of patients received chemotherapy in our study, and chemotherapy was also an independent prognostic factor. Conversely, patients with ESCC are more likely to receive radiation therapy [10]. Here, ∼58.6% of EAC patients received radiotherapy, but it showed no significant association with prognosis. Radiotherapy plays a crucial role in the treatment of EC and is almost always carried out before or after surgery. Our findings support a previous study that showed no improvement in OS and CSS in stage I-III patients who received single or combined radiotherapy before and after surgery, compared with patients who did not receive radiotherapy [25]. Further evidence-based data are required to clarify this. The overall prognosis for patients has improved markedly because of the awareness and surveillance of individuals with Barrett's esophagus (BE), more accurate selection of patients for curative treatment, better surgical and perioperative therapy, and the addition of neoadjuvant chemotherapy or chemoradiotherapy for localized EAC [5,6,27]. Postoperative mortality and complication rates are much higher than those of endoscopic therapy [28,29], and among all surgical methods, including esophagectomy and esophagectomy with gastrectomy, endoscopic therapy was associated with the best outcome in stage I-III EC patients [25].
Therefore, endoscopy, which allows early screening of certain high-risk individuals to detect premalignant lesions and also offers a treatment option, is a very important tool for guiding treatment and assessing prognosis.
This retrospective study has several considerable limitations. First, inherent selection bias was inevitable. Second, the majority of the study population was White. ESCC remains the most frequent histological type of EC in Asia, in a region stretching from northern Iran east to China and north to Russia [30,31], but since EAC is extremely rare in China, we lack real-world data to validate the model there. Furthermore, endoscopic therapy is currently prevalent in clinical practice, whereas this study lacked the corresponding subgroup data. In addition, etiological factors of EAC, including obesity [32], Helicobacter pylori infection [33,34], tobacco smoking [32,35], alcohol consumption, dietary factors [36], medication [13,37], and genetic factors [38,39], undoubtedly affect patient survival but were not included here because the information was incomplete. Although the tumor response to neoadjuvant chemotherapy or chemoradiotherapy is another important prognostic factor [27,40], there are currently no known biomarkers or diagnostic modalities that can reliably predict a patient's response to neoadjuvant chemoradiation. Unfortunately, the addition of neoadjuvant chemotherapy or chemoradiotherapy for localized EAC was not addressed in this study. Finally, because the SEER-Medicare population covered EAC cases from 1975 to 2017, there was considerable uncontrolled variation. Although the database provided the number of patients who received chemotherapy, radiotherapy, or surgery alone or in combination (Supplemental Table 1), we did not further stratify treatment strategies. The data available for analysis would be significantly reduced if we further grouped patients by the timing of chemoradiotherapy relative to surgery (preoperative or postoperative). Moreover, the timing of chemoradiotherapy relative to surgery is unknown for most patients in the surgical group in the SEER database, and the number of cases with a detailed sequence is relatively small.
We acknowledge that this ambiguity could not be clarified further. We believe the evidence will be further confirmed in future real-world studies. Ultimately, prospective multicentre studies are needed to validate and utilize this predictive nomogram.

Conclusion
We assessed a large number of cases and incorporated clinical information to construct and validate a universally applicable EAC prediction model that achieved a better C-index and AUC than the traditional TNM staging system.
This nomogram can provide dynamic and personalized forecasts of OS and CSS for patients during follow-up after diagnosis. Age, marital status, insurance, tumor grade, TNM stage, surgery, and chemotherapy were significant independent predictors of OS and CSS. EAC patients can benefit from this nomogram through more aggressive posttherapy surveillance, and clinicians can be guided in selecting treatment plans.

Data Availability
Publicly available datasets were analyzed in this study. These data are available in the Surveillance, Epidemiology, and End Results (SEER) database (https://seer.cancer.gov/). The datasets generated in this study are available from the corresponding author upon request.

Additional Points
All statistical analyses were performed using R version 3.4.2 (the R Foundation for Statistical Computing, Vienna, Austria; http://www.r-project.org).