Prognostic Factors and a Nomogram Predicting Overall Survival and Cancer-Specific Survival for Patients with Collecting Duct Renal Cell Carcinoma

Background. Collecting duct renal cell carcinoma (CDRCC) is a rare type of renal cancer characterized by a poor prognosis. The aim of this work was to develop a nomogram predicting the overall survival (OS) and cancer-specific survival (CSS) of patients with CDRCC. Methods. A total of 324 eligible patients diagnosed with CDRCC from 2004 to 2015 were identified in the Surveillance, Epidemiology, and End Results (SEER) database. The Kaplan-Meier method was used to estimate the 1-, 3-, and 5-year OS and CSS of these patients. Univariate and multivariate Cox regression models were used to identify the independent risk factors associated with OS and CSS. The nomogram was developed based on these factors and evaluated by the concordance index (C-index) and calibration curves using the bootstrap resampling method. The predictive accuracy of the nomogram was also compared with that of the American Joint Committee on Cancer (AJCC) staging system. Results. The estimated 1-, 3-, and 5-year OS and CSS rates in the analytic cohort were 56.4% and 60%, 32.5% and 37.3%, and 28.7% and 33.6%, respectively. The multivariate model revealed that age, tumor size, tumor grade, N stage, M stage, surgical type, and chemotherapy were independent predictors of OS, while tumor size, tumor grade, N stage, M stage, surgical type, and chemotherapy were independently associated with CSS. A nomogram developed using these factors showed relatively good discrimination and calibration. The C-index for OS and CSS was 0.764 (95% CI: 0.735~0.793) and 0.783 (95% CI: 0.754~0.812), respectively, which was superior to that of the AJCC stage (C-index: 0.685 (95% CI: 0.654~0.716) and 0.703 (95% CI: 0.672~0.734)). Patients were divided into low-risk, intermediate-risk, and high-risk groups according to the total points calculated by the nomogram. Patients in the low-risk group (97 mo and not reached) experienced significantly longer median OS and CSS than those in the intermediate-risk (17 mo and 18 mo) and high-risk groups (5 mo for both). The calibration curves showed good agreement between the predicted and actual probabilities of OS and CSS. Conclusion. CDRCC has an aggressive biological behavior with a relatively poor prognosis. A survival prediction nomogram providing an individualized evaluation of OS and CSS in patients with CDRCC was presented, potentially helping urologists to perform better risk stratification.


Introduction
Renal cell carcinoma (RCC) is one of the most common human malignancies worldwide, and its incidence has steadily increased in most countries [1]. The prognosis of patients with RCC is generally favorable, with 5-year overall survival (OS) and cancer-specific survival (CSS) rates of 73.2%~87.9% and 84%~95%, respectively [2,3]. Many prognostic models, such as UISS, SSIGN, and Leibovich, have been established to predict the oncologic outcome in patients with RCC [4][5][6]. The American Joint Committee on Cancer (AJCC) staging manual is the most commonly used model in clinical practice; it combines the T, N, and M stages to divide patients into stage groups I~IV [7]. However, the above models were established mainly in populations with the clear cell subtype. It is currently unknown whether they are still accurate enough to predict the prognosis of other RCC subtypes.
Collecting duct renal cell carcinoma (CDRCC) is a rare but aggressive histologic subtype of RCC, estimated to comprise less than 1% of all RCC cases [8][9][10][11]. Few studies have explored the survival outcomes and prognostic factors of CDRCC because of the rarity of this subtype. Most studies of CDRCC are case reports or single-center series with limited sample sizes, which cannot provide comprehensive insights for urologists [12][13][14]. The largest series to date included 577 CDRCC cases collected from the National Cancer Database and revealed that this subtype is drastically more aggressive than clear cell renal cell carcinoma (CCRCC) [9]. Compared to patients with CCRCC, patients with CDRCC had higher tumor grade (G3+G4: 62.3% vs. 24.4%), more advanced T stage (T3+T4: 57% vs. 20.9%), N stage (N1: 27.7% vs. 2.3%), and M stage (M1: 32.1% vs. 13%), and shorter median survival (12.3 mo vs. 122.5 mo) [9]. A recent retrospective analysis of 69785 patients with RCC, including 280 patients with CDRCC, revealed that CDRCC not only presents at a more advanced TNM stage than CCRCC but also shows higher cancer-specific mortality even after matching with G4 CCRCC (HR: 1.6, P < 0.01) [15]. Besides, the prognosis of CDRCC varies widely among previous studies, with reported median survival ranging from 13 months to 4.9 years [8-11, 15, 16]. The main reason for this variation could be the heterogeneous risk profiles of patients across studies. Therefore, a specific prognostic model for risk stratification in patients with CDRCC is urgently needed so that urologists can accurately inform patients about their long-term survival. The rarity of this subtype makes a large-scale prospective study difficult. Therefore, in this work, a nomogram was developed to predict OS and CSS in patients with CDRCC using the Surveillance, Epidemiology, and End Results (SEER) database.

Study Population.
The SEER database is a population-based cancer database that collects data from 18 registries among 14 states and covers around 28% of the population of the USA. A retrospective cohort study was conducted using the SEER database of the National Cancer Institute (http://seer.cancer.gov/). The datasets of patients in the present research were downloaded using the SEER*Stat software (version 8.3.9). Patients diagnosed with CDRCC (histological diagnostic code 8319/03 in the International Classification of Diseases for Oncology, 3rd Edition (ICD-O-3)) from 2004 to 2015 were included in this study. Patients with missing data on baseline characteristics or follow-up were excluded. Finally, 324 eligible patients were included for further analysis.

Variable Collection.
The considered variables were age at diagnosis, race, gender, tumor laterality, year of diagnosis, marital status, tumor grade, tumor size, AJCC stage, clinical T stage, N stage, M stage, surgical type, radiotherapy, chemotherapy, follow-up time, cancer-specific death, and death of any other cause. Age at diagnosis, tumor size, and follow-up time were recorded as continuous variables. The others were recorded as categorical variables. Patients were restaged according to the 8th edition of the AJCC Cancer Staging Manual. The surgical type was recorded as "without," "nephron-sparing surgery (NSS)," or "radical nephrectomy (RN)." Adjuvant treatment, including chemotherapy and radiotherapy, was recorded as "with" or "without." The primary outcome of this study was overall survival (OS), which was defined as the time interval between the day of diagnosis and death from any cause. The secondary outcome was cancer-specific survival (CSS), which was defined as the time interval between the day of diagnosis and cancer-specific death.
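The two outcome definitions above can be illustrated with a minimal Python sketch of how a follow-up record might be turned into (time, event) pairs. The `Patient` class, the status labels, and the toy records are hypothetical and are not taken from SEER. Treating deaths from other causes as censored for CSS is an assumption of this sketch, since the text does not state how such deaths were handled.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    months: float  # follow-up time from diagnosis, in months
    status: str    # hypothetical labels: "alive", "cancer_death", "other_death"


def os_endpoint(p: Patient) -> tuple[float, int]:
    """OS: death from any cause counts as the event (1); otherwise censored (0)."""
    return p.months, int(p.status in ("cancer_death", "other_death"))


def css_endpoint(p: Patient) -> tuple[float, int]:
    """CSS: only cancer-specific death counts as the event.

    Assumption: deaths from other causes are censored at the time of death.
    """
    return p.months, int(p.status == "cancer_death")


# Toy records for illustration only (not patient data).
cohort = [
    Patient(6.0, "cancer_death"),
    Patient(14.0, "other_death"),
    Patient(40.0, "alive"),
]

os_data = [os_endpoint(p) for p in cohort]
css_data = [css_endpoint(p) for p in cohort]
print(os_data)
print(css_data)
```

The second toy patient contributes an event to the OS analysis but is censored in the CSS analysis, which is why CSS rates are never lower than OS rates in the same cohort.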
Statistical Analysis.
Nomogram establishment and calibration were performed in R software (version 4.0.3) using the "rms" package, and other statistical analyses were performed using SPSS (version 26, IBM, Armonk, NY). Continuous variables were reported as medians with interquartile ranges (IQR), and categorical variables were reported as counts and proportions. The Kaplan-Meier method was used to estimate the 1-year, 3-year, and 5-year OS and CSS in the study cohort. Univariable and multivariable Cox proportional hazards regressions (with forward stepwise selection) were performed to identify the independent prognostic factors associated with OS and CSS. The selected independent factors were incorporated into the nomograms to predict the probability of 1-year, 3-year, and 5-year OS and CSS. The discrimination of the nomogram was measured by the concordance index (C-index), which ranges from 0.5 (no predictive power) to 1 (perfect prediction) [17]. Kaplan-Meier curves and the log-rank test were also used to evaluate the ability of the nomogram to stratify risk for OS and CSS. Calibration was evaluated using a calibration curve comparing the observed outcome probability with the nomogram-predicted probability, with 1000 bootstrap resamples. Besides, the C-index of the conventional AJCC stage was also calculated and compared to that of the established nomogram. All tests were two-sided, and P < 0.05 was considered statistically significant.
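To make the C-index concrete, the following Python sketch computes a simplified version of Harrell's concordance index for right-censored data. It is an illustration of the concept only, not the routine used in the study (which relied on the R "rms" package). The function name, the handling of tied risk scores (counted as 0.5), the exclusion of pairs with tied times, and the toy data are all assumptions of this sketch.

```python
def harrell_c_index(times, events, risk_scores):
    """Simplified Harrell's C-index.

    A pair (i, j) is comparable when patient i had the event and a strictly
    shorter follow-up time than patient j. The pair is concordant when the
    patient who failed earlier also has the higher predicted risk.
    Tied risk scores count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data for illustration only (not patient data).
times = [4, 10, 12, 60]         # months of follow-up
events = [1, 1, 0, 0]           # 1 = death observed, 0 = censored
risk = [0.9, 0.6, 0.7, 0.1]     # higher value = higher predicted risk

c = harrell_c_index(times, events, risk)
print(round(c, 3))
```

In this toy example, five pairs are comparable and four of them are ordered correctly by the risk score, so the index is 4/5. A value near 0.5 would indicate that the score orders patients no better than chance.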

Patients' Characteristics.
A total of 324 patients were identified and included in this study. The median age of CDRCC was 61.5 (IQR: 53~72) years, and 223 (68.8%) cases were male (

Nomogram Development and Validation.
Two nomograms incorporating the above-mentioned independent prognostic factors were developed to predict the 1-, 3-, and 5-year OS and CSS in patients with CDRCC (Figure 1). The C-index of the nomogram predicting OS and CSS was 0.764 (95% CI: 0.735~0.793) and 0.783 (95% CI: 0.754~0.812), respectively. The predicted probability of OS and CSS was then plotted as Kaplan-Meier curves stratified by the tertile of the predicted probability calculated from the nomogram to further assess the discriminative ability of the nomogram. The median OS and CSS were significantly longer in the low-risk (first tertile: 97 mo and not reached) group than in the intermediate-risk (second tertile: 17 mo and 18 mo) and high-risk (third tertile: 5 mo for both) groups, which also indicated a good discrimination of the established nomograms (Figures 2(a) and 2(b)). The accuracy of the nomogram and potential model overfit were assessed by the bootstrap validation with 1000 resamplings.
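The tertile-based stratification described above can be sketched in Python as follows. The sketch splits patients into three groups by a nomogram score and reports the Kaplan-Meier median survival per group, returning `None` when the median is not reached. The function names, the cut-point rule, the assumption that a higher score means higher risk, and the toy data are all hypothetical and do not reproduce the study's actual procedure.

```python
def tertile_groups(points):
    """Label each patient low/intermediate/high by tertile of the score.

    Assumption: a higher nomogram score corresponds to a higher risk.
    """
    ranked = sorted(points)
    n = len(ranked)
    cut1 = ranked[n // 3]        # smallest score of the middle group
    cut2 = ranked[(2 * n) // 3]  # smallest score of the top group
    return [
        "low" if p < cut1 else "intermediate" if p < cut2 else "high"
        for p in points
    ]


def km_median(times, events):
    """Kaplan-Meier median: first time with S(t) <= 0.5, else None (not reached)."""
    at_risk = len(times)
    surv = 1.0
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)  # deaths plus censored at t
        if deaths:
            surv *= 1 - deaths / at_risk
            if surv <= 0.5:
                return t
        at_risk -= leaving
    return None


def median_by_group(points, times, events):
    """Kaplan-Meier median survival within each tertile-defined risk group."""
    groups = tertile_groups(points)
    result = {}
    for g in ("low", "intermediate", "high"):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        result[g] = km_median([times[i] for i in idx], [events[i] for i in idx])
    return result


# Toy data for illustration only (not patient data).
points = [10, 20, 30, 40, 50, 60]   # hypothetical nomogram total points
times = [60, 72, 20, 30, 4, 8]      # months of follow-up
events = [0, 0, 1, 1, 1, 1]         # 1 = death observed, 0 = censored

medians = median_by_group(points, times, events)
print(medians)
```

A median of `None` corresponds to the "not reached" entry reported for CSS in the low-risk group: the estimated survival curve never drops to 50% during follow-up.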
The calibration curves showed a good agreement between the predicted and actual probability related to OS and CSS (Figure 3). Our nomogram was compared with the conventional AJCC stage to further verify the predictive accuracy. The C-index of the conventional AJCC stage was 0.685 (95% CI: 0.654~0.716) and 0.703 (95% CI: 0.672~0.734) in predicting OS and CSS, which was significantly inferior to that of our nomogram. Besides, the Kaplan-Meier curves demonstrated that the AJCC stage could stratify patients between stages I~II and stages III~IV, whereas it was unsatisfactory in stratifying patients among stages I~II (Figures 4(a) and 4(b)). Therefore, the established nomogram had better risk stratification than the conventional AJCC stage.

Discussion
CDRCC is not only a rare disease but also shows more aggressive biological behavior than conventional RCC. The survival of CDRCC patients varies widely, reflecting the prognostic heterogeneity associated with this disease [8-11, 15, 16]. An accurate prognostic model is critically important to inform patients about their long-term risk and to guide the follow-up schedule. In the present study, two nomograms were built from the SEER database that can numerically predict individual OS and CSS in patients with CDRCC based on clinicopathologic parameters and treatment modality. Patients could be divided into three risk groups according to the nomogram, with markedly different survival outcomes. In the high-risk group, the median OS and CSS were only 5 mo, drastically shorter than those in the intermediate- and low-risk groups. Besides, our nomograms were superior to the conventional AJCC staging system. Thus, the established nomograms could help urologists perform better risk stratification in patients with CDRCC.
Several larger series to date have described the clinicopathological characteristics and prognosis of CDRCC patients [8-11, 15, 16]. Tokuda et al. [11] retrospectively analyzed 81 cases from a multicenter study in Japan and found that the rates of lymph node and distant metastasis in these patients were 44.2% and 32.1%, respectively. Besides, the 1-, 3-, and 5-year CSS rates were only 69%, 45.3%, and 34.3%, respectively [11]. Karakiewicz et al. [16] analyzed 41 CDRCC and 5246 CCRCC cases, and their results revealed that CDRCC patients more often had higher tumor grade (G3+G4: 78% vs. 30%), advanced N stage (N1~2: 49% vs. 8%), and advanced M stage (M1: 19% vs. 14%), but cancer-specific mortality did not differ between CDRCC and CCRCC once the baseline data were taken into account. In contrast, Wright et al. [8] and Sui et al. [9] demonstrated that CDRCC patients had not only advanced T, N, and M stages at the time of diagnosis but also a worse prognosis than CCRCC patients. Although the present study did not compare survival between CDRCC and CCRCC, the proportions of advanced T, N, and M stages and of higher tumor grade in our study were higher than or at least similar to those in the above reports. Besides, it is worth noting that the 5-year OS and CSS rates in our study cohort were only 28.7% and 33.6%, respectively, which were drastically lower than the survival rates of CCRCC (73.2% and 84%) reported in previous studies [2,3].
The treatment of this rare disease is relatively difficult because of its aggressive biological behavior. Surgery remains the mainstay option for most urologists, especially for patients with localized disease. The largest series suggested that patients who underwent surgery had better survival than those who did not (HR: 0.13, P = 0.005) [10], and our findings are consistent with this result. The subgroup analysis revealed that patients who were diagnosed with metastatic CDRCC and treated with cytoreductive surgery could also experience longer survival than those who were not (median survival: 4.4 mo vs. 1.5 mo), which is in agreement with other studies [10,18]. Besides, many studies suggested that patients with CDRCC also benefit from chemotherapy [19][20][21]. In our study, chemotherapy was an independent protective factor associated with both OS and CSS. To our knowledge, conventional CCRCC is resistant to chemotherapy. However, CDRCC is derived from the distal nephron and shares many similarities with urothelial tract carcinoma [22]. The gemcitabine+platinum (GC) regimen, which is the classic chemotherapeutic regimen for urothelial tract carcinoma, has been used in clinical practice as first-line adjuvant chemotherapy in patients with metastatic CDRCC. In a prospective phase II study of the GC regimen in 23 cases of metastatic CDRCC, the objective response rate was 26% (1 complete and 5 partial responses) [19]. Another study analyzed 5 patients with metastatic CDRCC receiving the bevacizumab+GC regimen and found a partial response in 3 cases, stable response in 1 case, and complete remission in 1 case [20]. A recent clinical trial enrolled 26 patients with metastatic CDRCC treated with the sorafenib+GC regimen and found that the objective response rate was 30.8% [21]. In general, the results of GC-based chemotherapy in the above studies are not encouraging. Several case reports suggested that metastatic CDRCC can potentially benefit from other treatments, including cabozantinib [23], nivolumab [24], nivolumab+ipilimumab [25], personalized neoantigen-based immunotherapy [26], and HER2 blockade [27]. However, the level of evidence for these therapies is relatively low. Pagani et al. [28] reviewed the literature and summarized the current treatment options and ongoing phase II clinical trials focusing on CDRCC. A phase II trial conducted in France enrolled 41 patients with metastatic CDRCC treated with the bevacizumab+GC regimen [29]. The trial was completed, but the results have not been reported. Another phase II trial conducted in Italy, evaluating the activity and safety of cabozantinib as first-line treatment for metastatic CDRCC, was also completed, and its results are awaited [30]. To date, the management of CDRCC continues to be investigated and to evolve, and the optimal treatment remains unclear.
Several prognostic models have been established to predict the prognosis of patients with RCC [4][5][6]. However, these models largely focused on CCRCC, neglecting the significant subset of patients with non-clear cell histology. In 2018, Leibovich et al. [3] established histology-specific prognostic models focusing on three major histologic subtypes (CCRCC, papillary RCC, and chromophobe RCC), but CDRCC was not included because of its rarity. As mentioned above, the clinicopathological characteristics and treatment modality of CDRCC differ substantially from those of CCRCC. Thus, a prognostic model for CDRCC should also be specific to this subtype. May et al. [10] were the first to develop a prognostic model, based on American Society of Anesthesiologists (ASA) score 3-4, tumor size greater than 7 cm, stage M1, Fuhrman grade 3~4, and lymphovascular invasion. Although the predictive accuracy was excellent, the model was developed from only 95 cases and validated with 200 bootstrap resamples, which could cause overfitting to some extent. Another limitation of that study is the lack of information on chemotherapy and radiotherapy, which also play an important role in evaluating the prognosis of patients with CDRCC. Thus, our nomograms incorporated not only clinicopathological characteristics but also treatment modality. Our hope is that the established nomograms can provide a more comprehensive prognostic evaluation for this rare disease.
This study has several limitations. Firstly, the established models were based on a secondary analysis of a publicly available database. Previous studies suggested that CDRCC is difficult to differentiate from other histologic subtypes such as medullary RCC and urothelial papillary carcinoma [31,32]. Thus, the lack of centralized pathology review may have caused misclassification in the study cohort, which limits the quality of the data. Secondly, the SEER database does not provide information on patients' comorbidity and general condition, such as ASA score, Eastern Cooperative Oncology Group performance status (ECOG-PS), Karnofsky score, or blood parameters, nor on the details of the adjuvant chemotherapy regimen and its completion rate, all of which may also be associated with prognosis. Thirdly, the nomograms were validated only by bootstrap validation because of the rarity of this population. Further studies are needed to externally validate the proposed nomograms.

Conclusions
In conclusion, this study investigated a relatively large cohort of CDRCC patients from the SEER database and analyzed the prognostic factors associated with survival. A survival prediction nomogram was described that provides an individualized evaluation of OS and CSS in patients with CDRCC, which could help urologists perform better risk stratification.

Data Availability
All the data in the current study are publicly available in the Surveillance, Epidemiology, and End Results database (https://seer.cancer.gov/).