Competitive Risk Model for Specific Mortality Prediction in Patients with Bladder Cancer: A Population-Based Cohort Study with Machine Learning

Background Noncancer deaths account for a high proportion of deaths among patients with bladder cancer, yet these patients are often excluded from survival analyses, which increases selection bias in prediction models. Methods Clinicopathological information on bladder cancer patients was retrieved from the Surveillance, Epidemiology, and End Results (SEER) database, and the patients were randomly assigned to training and validation cohorts. The random forest method was used to calculate the importance of clinical variables in the training cohort. Univariate and multivariate analyses were undertaken to assess risk indicators, and a prediction nomogram based on the competitive risk model was constructed. The model's performance was evaluated using the calibration curve, the concordance index (C index), and the area under the receiver operating characteristic curve (AUC). Results In total, 39285 bladder cancer patients were enrolled (27500 in the training cohort and 11785 in the validation cohort). A competitive risk model was constructed to predict bladder cancer-specific mortality. In the training cohort, the overall C index was 0.876, and the AUC values for 1-, 3-, and 5-year cancer-specific mortality were 0.891, 0.871, and 0.853, respectively. In the validation cohort, the overall C index was 0.877, and the corresponding AUC values were 0.894, 0.870, and 0.847, suggesting good predictive performance. Conclusions The competitive risk model showed high accuracy and reliability and could help clinical decision-makers improve the management of bladder cancer patients.


Introduction
Bladder cancer is a malignancy that arises in the bladder mucosa. Its incidence is only slightly lower than that of prostate cancer, making it the second most prevalent malignancy of the urinary system. The Global Cancer Observatory (GLOBOCAN), a report produced by the World Health Organization, estimates that bladder cancer represents about 3% of all cancer diagnoses globally, with the highest proportion occurring in industrialized countries. In men, bladder cancer is the sixth most prevalent malignancy [1].
The American Cancer Society reported that in 2022 there were approximately 81180 newly diagnosed cases of bladder cancer in the United States (U.S.), of which 61700 were in men, ranking fourth among new cancer cases in men, and that the estimated number of deaths caused by bladder cancer was about 17100, of which 12120 were in men [2]. Most bladder cancers are urothelial carcinomas by pathological classification, and a few are squamous cell carcinomas or other pathological types [3]. Bladder cancer can be categorized into two subtypes: muscle-invasive bladder cancer (MIBC) and non-muscle-invasive bladder cancer (NMIBC), with the latter accounting for roughly 75% of patients. Half of the NMIBCs are of low pathological differentiation, whereas most MIBCs are of high atypia [4,5]. Smoking and occupational exposure to carcinogens (such as aromatic amines, polycyclic aromatic hydrocarbons, and chlorinated hydrocarbons) are important risk factors for bladder cancer [6]. Bladder cancer imposes a heavy social and economic burden on patients and remains a major challenge for global public health [7]. Identifying risk factors for cancer death and noncancer death is crucial for individualized cancer treatment. Noncancer death is also an important cause of death in cancer patients [8]. Studies have found that multiple noncancer factors, including those concomitant with other cancers, circulatory diseases, nondisease causes, other noncancer diseases, and respiratory diseases, also contribute substantially to the deaths of bladder cancer patients, and the proportion of noncancer causes is increasing. Therefore, management of other complications is critical when treating bladder cancer patients [9].
Conventional predictions of prognosis, recurrence, survival, and mortality in bladder cancer depend on the number of cancer sites, cancer size, recurrence rate, pathological type, in situ cancer, and TNM stage, and cannot provide predictions that are individualized, precise, and applicable to individual patients [10,11]. As a prediction method based on a statistical model, the nomogram has notable advantages over other approaches and could produce a more rigorous prognostic prediction for bladder cancer patients [12,13]. The goal of this research was to build a competitive risk model, based on the SEER database, that predicts cancer-specific mortality among bladder cancer patients while accounting for noncancer deaths, and to evaluate its predictive accuracy, so as to provide a reference for clinical decisions.

Data Source.
We collected clinicopathological data on patients registered in the SEER program of the National Cancer Institute. SEER is a publicly accessible database that collects clinical, demographic, and outcome data on patients with all types of malignancies, with 18 registries covering 30% of the U.S. population. All data used in this study followed the specifications of the SEER database. The study involved no interventions and no private patient information, so ethical approval and informed consent were not needed.

Data Exclusion Criteria.
Although SEER includes a large number of cancer records, many values are missing from the registration process. Imputation can be difficult when a large proportion of values is missing. Therefore, patients meeting any of the following criteria were excluded:
(1) Age at diagnosis less than 18 years old
(2) Marital status at diagnosis unavailable
(3) Race unknown
(4) Derived AJCC stage group coded as NA or UNK stage
(5) Cancer stages unknown
(6) Regional nodes positive (1988+) coded as 99
(7) CS tumor size (2004-2015) coded as 999
(8) Survival months was 0
(9) SEER combined mets at DX-bone (2010+) coded as unknown or N/A
(10) SEER combined mets at DX-brain (2010+) coded as unknown or N/A
(11) SEER combined mets at DX-liver (2010+) coded as unknown or N/A
(12) SEER combined mets at DX-lung (2010+) coded as unknown or N/A
After screening the collected data, 39285 patients were included in this analysis.
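The exclusion criteria above amount to a record-level filter. The following is a minimal Python sketch of such a filter, not the authors' code (the study used R). The dictionary keys and the set of "unknown" labels are hypothetical stand-ins for the SEER column names and codes, and criteria (4) and (5) are assumed here to be covered by a single stage field.

```python
# Hypothetical labels that mark a missing categorical value.
UNKNOWN = {"Unknown", "N/A", "NA", "UNK Stage", ""}

# Hypothetical field names; real SEER exports use longer column labels.
CATEGORICAL_FIELDS = [
    "marital_status", "race", "ajcc_stage",
    "mets_bone", "mets_brain", "mets_liver", "mets_lung",
]


def is_eligible(rec):
    """Return True if a patient record passes every exclusion criterion."""
    if rec["age"] < 18:                        # criterion (1)
        return False
    if rec["regional_nodes_positive"] == 99:   # criterion (6)
        return False
    if rec["cs_tumor_size"] == 999:            # criterion (7)
        return False
    if rec["survival_months"] == 0:            # criterion (8)
        return False
    # criteria (2)-(5) and (9)-(12): no unknown categorical values
    return all(rec[field] not in UNKNOWN for field in CATEGORICAL_FIELDS)
```

Applying `is_eligible` to each record and keeping those that return True gives the analysis cohort under these assumptions.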

Data Analysis.
Deaths from bladder cancer were defined as the event of interest, deaths from other causes as competing events, and patients who were alive or lost to follow-up as censored observations.
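The three outcome categories described above (bladder cancer death, death from other causes, and alive or lost to follow-up) are usually encoded as integer status codes before fitting a competing-risk model. A minimal Python sketch of that encoding follows; the input strings and the 0/1/2 convention are assumptions chosen for illustration, not values taken from the study.

```python
# Status codes: an assumed, commonly used convention.
CENSORED, CANCER_DEATH, OTHER_DEATH = 0, 1, 2


def code_event(vital_status, cause_of_death):
    """Map a patient's outcome to a competing-risk status code.

    vital_status and cause_of_death are hypothetical string fields.
    """
    if vital_status != "dead":
        return CENSORED          # alive or lost to follow-up
    if cause_of_death == "bladder cancer":
        return CANCER_DEATH      # event of interest
    return OTHER_DEATH           # competing event
```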
Before modeling, random sampling without replacement was conducted according to the 7 : 3 ratio commonly used for risk models to generate the training and validation cohorts. First, the random forest method was used to calculate the importance of clinical variables in the training cohort, and the variables with high importance were carried forward into the competitive risk model. Subsequently, the univariate competitive risk model was fitted in the training cohort, and multivariable analysis was carried out on variables with p < 0.1 [14]. Then, variables with p < 0.1 in the multivariate analysis were selected to construct the predictive nomogram of bladder cancer-specific mortality. A competitive risk model for bladder cancer-specific mortality was constructed, and a predictive nomogram was plotted.
To examine the prediction performance (accuracy) of the model, we used the c-statistic and the calibration curve. The R 4.0.4 software (R Development Core Team, Vienna, http://www.R-project.org) was used for all statistical analyses. The crr function in the 'cmprsk' package was used to fit the competitive risk model, whereas the cuminc function was used to conduct the Fine-Gray test [15].
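The 7 : 3 split without replacement described above can be sketched in a few lines of Python. This is an illustration only: the function name and the seed are arbitrary choices (the source does not report a seed for the split), and rounding the training size up is an assumption that happens to match the reported cohort sizes.

```python
import random


def split_cohort(ids, train_parts=7, total_parts=10, seed=0):
    """Shuffle patient ids once and cut them into training and validation sets.

    Each id lands in exactly one cohort, so sampling is without replacement.
    The seed is an arbitrary illustrative value.
    """
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    # Ceiling division in integer arithmetic avoids floating-point rounding.
    n_train = -(-len(shuffled) * train_parts // total_parts)
    return shuffled[:n_train], shuffled[n_train:]


train_ids, valid_ids = split_cohort(range(39285))
```

With 39285 ids this yields 27500 training and 11785 validation ids.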
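For readers unfamiliar with cumulative incidence under competing risks, the quantity that cuminc in 'cmprsk' estimates can be illustrated with a small nonparametric estimator. The Python sketch below is a simplified Aalen-Johansen-type calculation written for illustration; it is not the cmprsk implementation, and it assumes the status codes 0 = censored, 1 = event of interest, 2 = competing event.

```python
def cumulative_incidence(times, events, cause=1):
    """Nonparametric cumulative incidence of `cause` under competing risks.

    events: 0 = censored, any other integer = cause of failure.
    Returns a list of (time, cif) pairs at each time where any event occurs.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0   # probability of being free of any event just before t
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_here = d_any = d_cause = 0
        # Collect all subjects whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            status = data[i][1]
            n_here += 1
            if status != 0:
                d_any += 1
            if status == cause:
                d_cause += 1
            i += 1
        if d_any:
            cif += surv * d_cause / n_at_risk   # cause-specific increment
            surv *= 1 - d_any / n_at_risk       # all-cause survival update
            curve.append((t, cif))
        n_at_risk -= n_here
    return curve


curve = cumulative_incidence([1, 2, 3, 4], [1, 2, 1, 0])
```

Unlike one minus the Kaplan-Meier estimate, this curve does not treat deaths from other causes as censored, so the increments for each cause are weighted by survival from all causes.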

Clinical Characteristics of Included Patients.
There were 108884 patients identified who received bladder cancer diagnoses between 2010 and 2015. After screening by the eligibility requirements, 39285 patients were included, with a maximum follow-up of 83 months and a median follow-up of 29 months. There were 13086 patients who died, among whom 6571 died of bladder cancer and 6515 of other causes. Patients were allotted at random to either the training cohort (n = 27500) or the validation cohort (n = 11785). The patients had a mean age of 70.82 ± 11.84 years. Among them, 24669 (62.79%) were married, 29573 (75.28%) were male, and 35133 (89.43%) were white.
There were 37443 patients (95.31%) who had no regional lymph node biopsy or negative biopsy results. As for the primary cancer site, there were 12142 (30.91%) in bladder NOS and 9833 (25.03%) on the lateral wall of the bladder. There were 18240 (46.43%) patients with grade IV cancer and 9797 (24.94%) with grade II. The number of papillary transitional cell carcinomas was 26737 (68.06%). For cancer stages, 16978 patients (43.22%) were in stage 0a or 0is, and 9963 patients (25.36%) were in stage I. For surgery of the primary site, 25288 patients (64.37%) underwent excisional biopsy. For cancer size, 15060 patients (38.34%) had tumor sizes of less than 3 cm. As for the number of tumors, 25415 patients (64.69%) were diagnosed with a single tumor. Table 1 presents the detailed clinical features of the patients who were included in the research.

Clinical Variables Determination by Random Forest in the Training Cohort.
The randomForest package was applied to evaluate the importance of the clinical variables, with the random seed set to 123. Age, primary, surgery, size, and grade were considered to be characteristic representative variables in the training cohort, and their importance values from random forest screening are shown in Figure 1.

Construction of the Competitive Risk Model.
The five clinical variables whose importance in the random forest was close to 0 were excluded, and univariate and multivariate analyses were performed on the remaining clinical variables. We constructed a univariate competitive risk model in the training cohort, and the results showed that all variables were statistically significant (p < 0.1). All parameters in this univariate model were incorporated into the multivariate competitive risk model, and the results showed that age at diagnosis, marital status at diagnosis, sex, primary site-labeled, grade, ICD-O-3 hist/behav, derived AJCC stage group 7th ed (2010-2015), derived AJCC T 7th ed (2010-2015), RX Summ-surg prim site (1998+), chemotherapy recode (yes, no/unk), CS tumor size (2004-2015), and total number of in situ/malignant tumors for patients were independent risk indicators for bladder cancer-specific mortality. Therefore, we incorporated the above variables to construct a competitive risk model for predicting tumor-specific mortality (Table 2; Figures 2 and 3).

Model Validation.
The C index was adopted to assess the accuracy of the model in the training and validation cohorts. In the training cohort, the overall C index was 0.876, and the areas under the ROC curve (AUC) for 1-, 3-, and 5-year cancer-specific mortality were 0.891, 0.871, and 0.853, respectively. In the validation cohort, the overall C index was 0.877, and the AUC values for 1-, 3-, and 5-year cancer-specific mortality were 0.894, 0.870, and 0.847, respectively, which indicated good prediction performance. The details are shown in Figure 4 and Figure 5. In addition, the calibration curve showed that the values predicted by the model were almost identical to the observed values, indicating considerable accuracy.
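The source does not state how its C index was computed. As an illustration of the idea, the Python sketch below implements one common concordance formulation for competing risks, in which a patient who has already died of another cause stays in the comparison set for later cancer deaths. It omits any weighting for censoring, and the function name and status codes (0 = censored, 1 = event of interest, 2 = competing event) are assumptions.

```python
def competing_risk_c_index(times, events, risk, cause=1):
    """Fraction of comparable pairs in which the patient who experienced
    `cause` first was assigned the higher predicted risk.

    A pair (i, j) is comparable when i had the event of interest and j
    either was still under observation after that time or had already
    experienced a competing event.
    """
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        if events[i] != cause:
            continue
        for j in range(len(times)):
            if i == j:
                continue
            followed_longer = times[j] > times[i]
            competing_before = (
                events[j] not in (0, cause) and times[j] <= times[i]
            )
            if followed_longer or competing_before:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5   # ties in predicted risk count half
    return concordant / comparable if comparable else float("nan")
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of the predicted risks.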

Discussion
Bladder cancer is a prevalent malignancy of the urinary system, with an incidence second only to that of prostate cancer, and presents a great threat to public health. Individualized cancer treatment is now taken more seriously, and accurate prediction of the survival, prognosis, and mortality of bladder cancer patients is of great importance. Conventional survival analyses (e.g., the Cox proportional hazards model and the Kaplan-Meier method) usually include only one endpoint event, such as death. However, when there are multiple events that compete with each other, a single-endpoint analysis leads to biased estimates of the probability of the endpoint event. In this research, we employed a competitive risk model to evaluate the risk variables that affect bladder cancer patients' prognoses. In addition to bladder cancer-related mortality, this model takes into consideration deaths attributed to other forms of cancer as well as other events.
Nomograms for predicting the prognosis of bladder cancer have drawn increasing attention recently. There have been plenty of studies that applied nomograms to predict the overall survival (OS) of bladder cancer patients. Zhan et al. plotted a nomogram with high discrimination and accuracy and constructed a corresponding risk categorization system to predict the cancer-specific survival probability for MIBC patients who underwent partial cystectomy [16]. Zhan et al. also constructed a nomogram to provide an accurate prognostic prediction of cancer-specific survival probability in patients with lymph node-positive bladder cancer [17]. Tao [20]. Many other studies have also constructed nomograms of overall survival in patients with bladder cancer [21,22]. In this research, a nomogram to predict individual cancer-specific mortality was established based on a sizable cohort of bladder cancer patients from the SEER database.
The nomogram was constructed from demographic, pathological, and surgical data and performed well in both the training and the validation cohorts, indicating that it is clinically applicable for predicting bladder cancer-specific mortality. Our model contained the following variables from clinical practice: age at diagnosis, marital status at diagnosis, sex, primary site-labeled, grade, ICD-O-3 hist/behav, derived AJCC stage group 7th ed (2010-2015), derived AJCC T 7th ed (2010-2015), RX Summ-surg prim site (1998+), chemotherapy recode (yes, no/unk), CS tumor size (2004-2015), and the total number of in situ/malignant tumors for patients. Age was a significant risk variable for bladder cancer-specific mortality, suggesting that the risk of death in bladder cancer patients increases significantly with age. Prognostic analyses of bladder cancer conducted in other studies also showed that age played a crucial role in cancer death and that the death rate increased with age at diagnosis [23,24]. Another risk factor appeared to be sex. Female bladder cancer patients had a poorer prognosis than males, which was consistent with other studies that found higher cancer-specific mortality in female patients with bladder urothelial carcinoma [25,26]. Cancer stage also proved to be a substantial risk indicator for the prognosis of bladder cancer. A higher stage was associated with a worse prognosis, which was also consistent with many other studies. Cancer T staging was an important part of the model. Many reports have shown that the depth of bladder invasion is strongly linked to bladder cancer patients' prognoses. As the T stage advances, the cancer becomes more invasive and progressive [27].
On the other hand, the prognosis of bladder cancer patients with distant organ metastasis was highly unfavorable compared with that of patients without metastasis. Studies showed that different distant metastasis sites had different effects on mortality, and the mortality of patients with multiple distant metastasis sites was significantly higher than that of patients with a single distant metastasis site [28,29]. Furthermore, the nomogram also showed that an increase in cancer size resulted in a poorer prognosis in bladder cancer patients. Other studies have further confirmed the influence of cancer size on bladder cancer-specific prognosis [30]. Moreover, the number of cancers is also related to the prognosis of bladder cancer. Patients with multiple bladder cancers had a poorer prognosis than those with a single bladder cancer.
[...] of multiple endpoints. It is an analytical method for survival data with multiple potential outcomes. If clinical survival data have multiple outcomes and the assumption of independent censoring is not satisfied because of competing outcomes, the Cox proportional hazards model cannot be used for multivariable analysis; otherwise, the estimated hazard ratios will be biased. In this situation, the competitive risk model shows its unique value. However, this study has some limitations. First, the prognosis of individuals with bladder cancer is likely influenced by their lifestyle, genes, and other factors; however, the SEER database did not include information on these variables. Second, only internal validation was performed for our model. Further validation with external clinical data and future clinical applications are needed. Future studies are expected to investigate these other aspects.
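The contrast between single-endpoint and competing-risk analysis can be stated in standard notation. The formulas below are textbook definitions given for orientation rather than expressions taken from this study; the symbols T (time to first event), D (cause of that event), k (cause of interest), x (covariates), and β (regression coefficients) are introduced here as assumptions.

```latex
% Cumulative incidence function (CIF) for cause k
\[
F_k(t) = \Pr(T \le t,\ D = k)
\]

% Subdistribution hazard: patients who already failed from another
% cause remain in the risk set
\[
\lambda_k(t) = \lim_{\Delta t \to 0}
  \frac{\Pr\bigl(t \le T < t + \Delta t,\ D = k \mid
        T \ge t \ \text{or}\ (T < t,\ D \ne k)\bigr)}{\Delta t}
  = -\frac{d}{dt}\log\bigl(1 - F_k(t)\bigr)
\]

% Proportional subdistribution hazards (Fine-Gray) model
\[
\lambda_k(t \mid x) = \lambda_{k,0}(t)\,\exp(\beta^{\top} x)
\quad\Longrightarrow\quad
F_k(t \mid x) = 1 - \bigl(1 - F_{k,0}(t)\bigr)^{\exp(\beta^{\top} x)}
\]
```

Treating deaths from other causes as censored and reporting one minus the Kaplan-Meier estimate targets a different quantity, one that is never smaller than F_k(t); this is the kind of deviation a single-endpoint analysis introduces when competing events are present.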

Conclusion
We constructed a competitive risk model to predict cancer-specific mortality in bladder cancer patients. The model showed high accuracy and reliability and could help clinical decision-makers improve the management and follow-up of these patients.

Data Availability
The data that support the findings of this study are available from the corresponding author upon reasonable request.

Conflicts of Interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as potential conflicts of interest.