Analysis of Bladder Cancer Staging Prediction Using Deep Residual Neural Network, Radiomics, and RNA-Seq from High-Definition CT Images

Bladder cancer has recently seen an alarming increase in global diagnoses, ascending as a predominant cause of cancer-related mortalities. Given this pressing scenario, there is a burgeoning need to identify effective biomarkers for both the diagnosis and therapeutic guidance of bladder cancer. This study focuses on evaluating the potential of high-definition computed tomography (CT) imagery coupled with RNA-sequencing analysis to accurately predict bladder tumor stages, utilizing deep residual networks. Data for this study, including CT images and RNA-Seq datasets for 82 high-grade bladder cancer patients, were sourced from the TCIA and TCGA databases. We employed Cox and lasso regression analyses to determine radiomics and gene signatures, leading to the identification of a three-factor radiomics signature and a four-gene signature in our bladder cancer cohort. ROC curve analyses underscored the strong predictive capacities of both these signatures. Furthermore, we formulated a nomogram integrating clinical features, radiomics, and gene signatures. This nomogram's AUC scores stood at 0.870, 0.873, and 0.971 for 1-year, 3-year, and 5-year predictions, respectively. Our model, leveraging radiomics and gene signatures, presents significant promise for enhancing diagnostic precision in bladder cancer prognosis, advocating for its clinical adoption.


Introduction
Urothelial cancer can arise anywhere between the renal pelvis and the urethra and accounts for approximately 3 percent of all cancer-related deaths in the United States [1]. Two different risk factors are associated with this disease, which most commonly occurs in the bladder [2]. In the Western world, smoking and exposure to environmental and industrial carcinogens pose the most serious health risks [3]. Bladder cancer develops along two distinct but somewhat overlapping pathways, termed papillary and nonpapillary, which correspond to two clinically and pathogenetically distinct tumor types [4]. Approximately 80% of bladder neoplasms are superficial papillary lesions arising from diffuse mucosal hyperplastic changes known as low-grade urothelial neoplasia [5]. Despite progress in surgical techniques and drug therapy, it remains difficult to predict the outcome for patients with advanced or chemotherapy-resistant bladder cancer [6].
Computed tomography (CT) is the most common method for diagnosing bladder cancer [7]. When bladder cancer is diagnosed preoperatively, staging can be more accurate and recurrence can be detected earlier after surgery [8]. Traditional medical CT images, however, require a great deal of acquisition time and storage space because they contain extensive information about human tissues [9]. Segmenting CT images is therefore not only more difficult but also more expensive [10]. Mis-segmentation occurs easily in medical CT images, especially when body tissues are abnormal, for example when they are severely damaged [11].
Over the past few years, radiomics has gained increasing attention. Medical images are converted into high-dimensional, mineable data using high-throughput quantitative feature extraction, followed by data analysis to support decision-making [12]. As pattern recognition tools have matured and dataset sizes have grown, radiomics has progressed, which may improve prediction accuracy in oncology [13]. A number of previous studies have demonstrated the potential of objective and quantitative imaging descriptors as prognostic and predictive biomarkers [14].
RNA sequencing (RNA-Seq) uses high-throughput sequencing technology to measure the expression levels of mRNA, small RNA, noncoding RNA, and other transcripts in a transcriptome. Since the early 2000s, RNA-Seq technology has grown rapidly and has become one of the most essential tools for analyzing transcriptome-wide changes in gene expression and alternative splicing of mRNAs. The development of next-generation sequencing technology has made it possible to apply RNA-Seq to a broader range of applications. Multiple biomarkers from RNA-sequencing studies can guide the diagnosis and treatment of cancer patients. A panel of biomarkers, rather than the analysis of individual markers, offers the most promising route to a marker powerful enough to transform clinical management. Therefore, in this work, we aim to construct a model based on RNA-sequencing analysis and radiomics to better predict the prognosis and guide the treatment of patients with bladder cancer. In addition, we evaluate immune cell infiltration based on the combination of RNA-sequencing analysis and radiomics. GO and KEGG enrichment analyses were applied to explore the potential pathways.

Imaging Data of Patients with Bladder Cancer.
The Cancer Imaging Archive (TCIA) website hosts large volumes of deidentified cancer medical images that are publicly available for download. The data are organized into "collections," such as patient imaging data associated with one disease (such as lung cancer), one type of image (such as MRI and CT), or one research topic (such as digital histopathology). DICOM is the main file format used by TCIA for radiology images. Supplementary data are also available, including patient outcomes, treatment details, genomics, and expert analyses. This study acquired 82 CT scans of patients with bladder cancer from the TCIA dataset, which is associated with the TCGA database.

The RNA-Sequencing from the TCGA Database.
The study collected RNA-sequencing data and relevant clinical data on bladder cancer from The Cancer Genome Atlas (TCGA) database.

Image Segmentation.
The segmentation of CT images, a critical step in our study, was performed using a semiautomated method to delineate the regions of interest (ROIs) corresponding to the bladder tumors. Each ROI was carefully reviewed and adjusted by two experienced radiologists to ensure accuracy, with discrepancies resolved by consensus.

Feature Extraction.
Following segmentation, radiomics features were extracted from the delineated ROIs using the PyRadiomics package for Python. This comprehensive feature extraction process involved calculating a variety of features, including shape, intensity, texture, and wavelet-based features, to capture the tumor's phenotypic characteristics. The key feature extraction parameters (e.g., bin width = 25 and resampling voxel size = 1 × 1 × 1 mm³) were set based on best practices in the literature to ensure robustness and reproducibility of the feature set.
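To illustrate what a bin width of 25 means in practice, the following minimal sketch applies fixed-bin-width gray-level discretization to a handful of intensities. It follows the formula documented for PyRadiomics, floor(X / W) − floor(min(X) / W) + 1. The function name and the sample intensities are hypothetical and are not the study's code or data.

```python
import math


def discretize_fixed_bin_width(intensities, bin_width=25):
    """Map raw intensities to 1-based gray-level bins of equal width."""
    lowest_bin = math.floor(min(intensities) / bin_width)
    return [math.floor(v / bin_width) - lowest_bin + 1 for v in intensities]


# Hypothetical Hounsfield-unit values inside a tumor ROI (not study data).
roi_values = [-12, 3, 24, 25, 49, 50, 118]
bins = discretize_fixed_bin_width(roi_values, bin_width=25)
print(bins)
```

Texture features are computed on these bin indices rather than on the raw intensities, which is why the bin width affects reproducibility.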

Feature Preprocessing and Selection.
Feature preprocessing consists of two steps: step 1 removes outliers and null values, and step 2 normalizes the values to remove the effect of differing units and scales. Feature selection is one of the most crucial steps for building models that generalize well, since high-dimensional data are often cluttered with irrelevant features, which can cause overfitting. After selection, the variable space is simpler and the retained variables are less dependent on one another. The final step is to construct the radiomics features from the selected features by using AdaBoost with leave-one-out cross-validation.
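The two preprocessing steps can be sketched as follows, assuming that features containing null values are dropped and that normalization means z-scoring each feature. The text does not specify the outlier rule, so outlier handling is omitted here. The feature names and values are hypothetical.

```python
import statistics


def drop_incomplete_features(table):
    """Keep only features (columns) that have no missing values."""
    return {
        name: values
        for name, values in table.items()
        if all(v is not None for v in values)
    }


def zscore(values):
    """Rescale one feature to mean 0 and population standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]  # a constant feature carries no signal
    return [(v - mean) / sd for v in values]


# Hypothetical feature table: one list of per-patient values per feature.
features = {
    "shape_volume_mm3": [1200.0, 1500.0, 900.0, 1800.0],
    "firstorder_mean": [35.0, 42.0, None, 39.0],  # contains a null, so dropped
    "glcm_contrast": [0.8, 1.2, 1.0, 1.4],
}
clean = drop_incomplete_features(features)
normalized = {name: zscore(values) for name, values in clean.items()}
```

After this step, features measured in different units (for example, volume and texture contrast) are on a common scale, so a penalized model does not favor one simply because of its magnitude.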

Differential Expression Analysis in Bladder Cancer Cohort.
The limma package for R was used to analyze differential expression. The P values obtained for the TCGA data were adjusted to correct for false positives. To identify variations in mRNA expression, a threshold of "adjusted P < 0.01 and log2(fold change) > 2 or log2(fold change) < −2" was used. While a log2 fold change (FC) of 1 (equivalent to a twofold change) is commonly used as a cutoff, we opted for a more stringent threshold to ensure the biological relevance of our findings. A log2 FC greater than 2 or less than −2 corresponds to a change in expression of more than fourfold, highlighting genes with potentially greater biological impact and reducing the likelihood of identifying changes due to random variation or minor fluctuations in gene expression.
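The threshold above can be expressed as a simple filter. The sketch below is illustrative only: the gene names, fold changes, and adjusted P values are hypothetical, and the filter would be applied to whatever table the differential expression step produces.

```python
def is_differentially_expressed(log2_fc, adj_p, fc_cutoff=2.0, p_cutoff=0.01):
    """Apply the rule: adjusted P < 0.01 and |log2(fold change)| > 2."""
    return adj_p < p_cutoff and abs(log2_fc) > fc_cutoff


def classify(log2_fc, adj_p):
    """Label a gene as up, down, or not significant."""
    if not is_differentially_expressed(log2_fc, adj_p):
        return "not significant"
    return "up" if log2_fc > 0 else "down"


# Hypothetical results: gene -> (log2 fold change, adjusted P value).
results = {
    "GENE_A": (2.6, 0.0004),   # passes both cutoffs, upregulated
    "GENE_B": (-3.1, 0.002),   # passes both cutoffs, downregulated
    "GENE_C": (1.4, 0.0001),   # significant P, but fold change too small
    "GENE_D": (2.9, 0.03),     # large fold change, but P not below 0.01
}
labels = {gene: classify(fc, p) for gene, (fc, p) in results.items()}

# A log2 fold change of 2 corresponds to a 2 ** 2 = 4-fold change.
fold_at_cutoff = 2 ** 2
```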

The Pathway Enrichment Analysis.
The data underwent functional enrichment analysis to validate the potential roles of the targets. Gene Ontology (GO) is a widely used resource for annotating genes with functions, namely molecular functions, biological processes, and cellular components. KEGG enrichment analysis is a useful method for understanding gene function and genome function at a broad level. Enrichment analysis using GO and KEGG was conducted in the R programming environment.
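The text does not state which statistical test underlies the enrichment analysis. Over-representation analysis is commonly based on the hypergeometric test, so the sketch below assumes that test; it is an illustration of the idea rather than the study's procedure. The function name and all counts are hypothetical.

```python
from math import comb


def enrichment_p_value(background, in_pathway, selected, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    background: number of genes considered in total
    in_pathway: number of background genes annotated to the pathway
    selected:   number of differentially expressed genes
    overlap:    number of selected genes that fall in the pathway
    """
    total = comb(background, selected)
    upper = min(in_pathway, selected)
    tail = sum(
        comb(in_pathway, i) * comb(background - in_pathway, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts, not taken from the study.
p = enrichment_p_value(background=20000, in_pathway=70, selected=100, overlap=8)
```

A small p indicates that the pathway contains more of the selected genes than would be expected by chance; in practice the resulting P values are also adjusted for the number of pathways tested.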

Construction of Radiomics Signature through Feature Selection.
The lasso technique was employed for regression analysis on data with many variables to identify the most valuable predictive characteristics from the initial dataset.
The selection of the regularization parameter, λ, in lasso regression is critical because it determines the strength of the penalty applied to the feature coefficients. To select an optimal λ, we used 10-fold cross-validation. This method divides the dataset into ten parts, trains the model on nine parts, and validates it on the remaining part. The process is repeated ten times, with each part serving as the validation set once. The optimal λ was the value that produced the lowest cross-validation error. This approach ensures that the chosen λ minimizes the prediction error while guarding against overfitting, without penalizing the model so heavily that the predictive power of important features is lost.
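The cross-validation procedure described above can be sketched with a deliberately simplified model. The example assumes a single centered feature, no intercept, and a squared-error loss, for which the lasso has a closed-form soft-thresholding solution; the study's actual model has many features, so this is a stand-in that shows only the λ selection loop. All function names, the data, and the λ grid are hypothetical.

```python
import random


def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def fit_lasso_1d(xs, ys, lam):
    """Closed-form lasso for one centered feature without an intercept.

    Minimizes (1 / (2n)) * sum((y - beta * x) ** 2) + lam * abs(beta).
    """
    n = len(xs)
    rho = sum(x * y for x, y in zip(xs, ys)) / n
    scale = sum(x * x for x in xs) / n
    return soft_threshold(rho, lam) / scale


def kfold_indices(n, k=10, seed=0):
    """Shuffle 0..n-1 and deal the indices into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cv_error(xs, ys, lam, k=10):
    """Mean held-out squared error across the k folds for one lambda."""
    fold_errors = []
    for held_out in kfold_indices(len(xs), k):
        held = set(held_out)
        train = [i for i in range(len(xs)) if i not in held]
        beta = fit_lasso_1d([xs[i] for i in train], [ys[i] for i in train], lam)
        fold_errors.append(
            sum((ys[i] - beta * xs[i]) ** 2 for i in held_out) / len(held_out)
        )
    return sum(fold_errors) / k


def select_lambda(xs, ys, grid, k=10):
    """Return the lambda in grid with the lowest cross-validation error."""
    return min(grid, key=lambda lam: cv_error(xs, ys, lam, k))


# Hypothetical centered data: y is roughly 2 * x plus a small wiggle.
xs = [i - 20.0 for i in range(41)]
ys = [2.0 * x + (0.5 if i % 2 else -0.5) for i, x in enumerate(xs)]
grid = [0.0, 0.1, 1.0, 10.0, 1000.0]
best_lambda = select_lambda(xs, ys, grid)
```

Every candidate λ is scored on the same ten folds, so the comparison between candidates is not affected by differences in how the data were split.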

Creating a Personalized Forecasting Algorithm.
To create a personalized prediction model using clinical information and RNA expression data, we conducted Cox regression analysis. First, a univariate Cox regression analysis was conducted to identify characteristics with possible prognostic significance. Variables with a p value below 0.05 in this initial analysis were considered for inclusion in the multivariable model. Subsequently, we employed a stepwise selection process, considering both forward selection and backward elimination, to refine the list of variables included in the final model. This approach ensured that the final model contained only variables that significantly contributed to the prediction of patient outcomes, thereby enhancing the model's specificity and generalizability.

Statistical Analysis.
Statistical analyses were performed in R, using the "radiomics" package for feature extraction and the "survival" package for survival analysis. Python was used for image processing tasks, with the PyRadiomics library for extracting radiomic features and scikit-image for image segmentation and preprocessing. The survival package in R supports Kaplan-Meier survival analysis and log-rank tests. Log-rank tests were carried out to assess the statistical differences between the survival probabilities depicted in the Kaplan-Meier curves. This test compares the observed survival results with the results expected if there were no difference between the groups. The relationship between survival risk and the hazard ratio (HR) was assessed through the Spearman correlation test and the Cox proportional hazards model. We determined statistical significance between two datasets by performing a rank sum test. A p value below 0.05 was deemed to be statistically significant. To mitigate the risk of false positives, we applied the Bonferroni correction method, which adjusts the significance level by dividing the standard threshold of 0.05 by the total number of tests conducted.
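As a worked illustration of the Bonferroni correction described above, the sketch below divides the 0.05 significance level by the number of tests and flags which p values remain significant. The p values are hypothetical and the function names are illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag each p value that falls below the corrected threshold."""
    cutoff = bonferroni_threshold(alpha, len(p_values))
    return [p < cutoff for p in p_values]


# Hypothetical p values from four tests; the corrected cutoff is 0.05 / 4.
p_values = [0.001, 0.012, 0.03, 0.2]
flags = significant_after_bonferroni(p_values)
```

With four tests, a p value of 0.03 would pass the uncorrected 0.05 level but fails the corrected level of 0.0125.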

The Basic Information of 82 Bladder Cancer Patients.
In this work, a total of 82 bladder cancer patients from the TCGA dataset were included in the analysis. Of these, 28 patients were younger than 65 and 54 were older than 65; 20 were female and 62 were male. All patients had high-grade disease. In terms of overall stage, 27 patients were in stage II, 31 in stage III, and 24 in stage IV. For the T stage, 1 patient was in T0, 6 in T2, 8 in T2a, 14 in T2b, 10 in T3, 16 in T3a, 13 in T3b, 1 in T4, and 8 in T4a, and the T stage was unknown for 5 patients. For the N stage, 47 patients were in N0, 9 in N1, 14 in N2, and 11 in NX, and the N stage was unknown for 1 patient. For the M stage, 42 patients were in M0, 3 in M1, and 37 in MX. Figure 1 introduces the process of this work in detail.

Results from Imaging Using Computed Tomography in Patients with Bladder Cancer.
In the first step, the CT imaging results of all 82 bladder cancer patients were loaded into 3D Slicer, a free, open-source software package for visualizing, processing, segmenting, registering, and analyzing medical, biomedical, and other 3D images and meshes. The CT images of two patients with bladder cancer are shown in the figures. Figure 2 shows a 65-year-old male patient with high-grade bladder cancer in stage III (T3a, N0, M0) (Figures 2(a) and 2(b)). Figure 3 shows a 64-year-old male patient with high-grade bladder cancer (T2a, N0, M0) (Figures 3(a) and 3(b)). In the next step, we outlined the region of interest (ROI) on each slice of the CT so that we could extract the three-dimensional features of each bladder cancer image. The ROI is taken to be the main component of the tumor. The tumor tissue was reconstructed using a 2 mm dilation algorithm in this study. After each sample was outlined and exported, an NRRD format file was produced (Figures 2(c

The Genes That Are Expressed at Varying Levels in the Bladder Cancer Cohort.
To investigate the genes closely linked to bladder cancer, we conducted differential expression analysis comparing the normal group with the bladder cancer cohort. A total of 546 genes were identified as differentially expressed, with 136 genes showing increased expression and 410 genes showing decreased expression (Figure 6(a)). The heatmap illustrates the genes that were expressed differently in the bladder cancer group compared to the normal group (Figure 6(b)).

The Potential Pathways Linked to Genes with Varying Expression Levels in the Bladder Cancer Group.
We then assessed the pathways linked to the differentially expressed genes. In KEGG enrichment analysis, the pathways most enriched among upregulated genes were the p53 signaling pathway, viral carcinogenesis, platinum drug resistance, and oocyte meiosis (Figure 7(a)). The pathways most enriched among downregulated genes included cGMP-PKG signaling, vascular smooth muscle contraction, TNF signaling, and tryptophan metabolism (Figure 7(b)). In GO enrichment analysis, the terms most enriched among upregulated genes included sister chromatid segregation, regulation of sister chromatid segregation, regulation of nuclear division, and regulation of mitotic sister chromatid segregation (Figure 7(c)). The terms most enriched among downregulated genes included striated muscle tissue development, regulation of blood vessel development, regulation of muscle system process, and regulation of muscle contraction (Figure 7(d)).

The Integrated Predictive Model Based on the Radiomics Signature and Gene Signature.
The preceding analysis identified the genes that were differentially expressed in the bladder cancer group. We next sought to identify the genes strongly correlated with the survival of patients with bladder cancer. First, univariate Cox regression analysis revealed that 8 differentially expressed genes were linked to the prognosis of patients with bladder cancer (Figure 8(a)). Then, lasso regression analysis was performed to further explore the prognosis-related genes. The multivariate Cox regression analysis revealed that four genes could potentially

Discussion
According to GLOBOCAN statistics, bladder cancer makes up 3% of global cancer cases and is especially common in developed nations [15]. According to data from the United States, bladder cancer ranks as the sixth most prevalent type of cancer [16]. Individuals aged 55 and above account for 90% of bladder cancer cases, and men are four times more susceptible to the disease than women [17]. In the United States, the 5-year survival rate for patients with metastatic disease is only 5%, far lower than the overall 5-year survival rate of 77%. It is therefore urgent to identify promising biomarkers to improve the prediction, diagnosis, and treatment of bladder cancer [18]. With the development of multiple imaging modalities, cancer can be diagnosed more easily [19]. CT is a frequently used technique for diagnosing bladder cancer [20]. Detecting bladder cancer before surgery can lead to accurate preoperative staging and early detection of recurrence postoperatively. Conventional CT scans require additional storage and processing time because they contain extensive detail about the human anatomy [21]. Hence, proper segmentation of CT images is crucial for improving the diagnosis and treatment of patients with bladder cancer [22]. In this work, we first obtained the CT images of 82 bladder cancer patients. Then, we outlined the ROI on each CT slice in order to extract the three-dimensional features from each image of bladder cancer. Finally, we successfully constructed the radiomics signature.
In choosing AdaBoost (adaptive boosting) for the cross-validation in feature selection, our decision was motivated by several factors that align with the goals of our study. AdaBoost is renowned for its capacity to combine a series of weak classifiers into a strong classifier, making it particularly suitable for our dataset, where the predictive power of individual features might be modest [24]. Through iterative processes, this algorithm corrects errors made by weak classifiers and adjusts the weights of misclassified instances, thereby enhancing the model's capacity to generalize beyond the training data to new data [25]. Furthermore, due to advancements in next-generation sequencing technology, RNA-Seq can now be used in a wider range of applications. RNA sequencing can provide cancer patients with a variety of biomarkers to aid in their diagnosis and treatment. Following this, we analyzed the differentially expressed genes and developed a prognostic prediction model based on 4 genes using Cox and lasso regression techniques.
The integration of clinical, radiomics, and gene data represents a significant advancement in developing a multifaceted prognostic model for bladder cancer. However, the interplay between these data types is complex and warrants further discussion. Clinical data provide a foundational understanding of patient health and disease characteristics, while radiomics and genetic data offer deeper insights into the tumor's phenotypic and molecular landscape.
While our study highlights the potential of CT technology and deep learning algorithms in improving the accuracy of bladder cancer staging, we recognize the challenges in generalizing these findings universally. Variations in healthcare infrastructure, access to advanced diagnostic tools, and population genetics can influence the applicability and effectiveness of these technologies. Furthermore, the diversity in patient demographics underscores the need for models that are robust across different ethnicities, ages, and genders. To increase the generalizability of our findings, future research should strive to include a wider range of patients and take into account the differences in healthcare delivery systems. Doing so not only improves the applicability of the results but also offers a more thorough understanding of the possible obstacles and aids in incorporating these technologies in different settings. This is crucial for advancing personalized medicine and ensuring that innovations in cancer diagnosis and treatment are accessible and effective for all segments of the population, regardless of geographical or socioeconomic status. Moreover, the integration of these technologies into clinical practice involves overcoming regulatory, ethical, and logistical hurdles. It is imperative to conduct further research to validate these approaches in diverse populations and settings and to continuously monitor their performance in real-world clinical scenarios.
In conclusion, it has been demonstrated that deep learning can be applied to CT images of bladder cancer to effectively segment lesions. CT image analysis based on such algorithms is significantly more accurate than ordinary imaging examinations for staging bladder cancer. In addition, bladder cancer tissue was found to harbor genes associated with prognosis, which can effectively forecast patient outcomes. The prognostic prediction model, based on the radiomics signature and the gene signature, effectively forecasts the outcome for individuals with bladder cancer.

Figure 1: The mechanism diagram shows the process of analysis.

Figure 3: (a-b) The image shows the CT of a 64-year-old bladder cancer patient; (c-d) the identification of the ROI in the CT image.

Figure 4: (a, b) Lasso regression analysis was applied to select the CT imaging features.

Figure 5: (a) The forest plot shows the imaging features in the training set, (b) the ROC curve shows the predictive value of the radiomics signature in the training set, (c) the nomogram of the radiomics signature in the training set, (d) the calibration curve shows the predictive value of the nomogram in the training set, (e) the forest plot shows the imaging features in the test set, (f) the ROC curve shows the predictive value of the radiomics signature in the test set, (g) the nomogram of the radiomics signature in the test set, and (h) the calibration curve shows the predictive value of the nomogram in the test set.
Obtaining biomarkers from multiple sources, including clinical characteristics, the radiomics signature, and the gene signature, is crucial because of their strong predictive value in cancer patients. In this study, we developed a prognostic prediction model by integrating the radiomics signature with the gene signature. The nomogram, along with the ROC curve, further confirmed the model's precision. The ability to accurately stage bladder cancer with CT will gradually improve as the technology continues to be updated. As deep learning algorithms become more sophisticated, CT scanning may become a routine screening method for bladder cancer in the future. Radiomic features may also become an effective target for bladder cancer treatment in the future, since these features are closely related to bladder cancer development and prognosis. For KEGG enrichment analysis, the increased activity of the p53 signaling pathway, viral carcinogenesis, platinum drug resistance, and oocyte meiosis pathways indicates an intricate interaction involving genetic changes, environmental influences, and resistance to chemotherapy in bladder cancer. These pathways are pivotal in cell cycle regulation, DNA damage response, and apoptosis, indicating their critical roles in tumor development and response to treatment. On the other hand, the suppression of pathways such as cGMP-PKG signaling, vascular smooth muscle contraction, TNF signaling, and tryptophan metabolism could reflect adaptation of the tumor microenvironment that facilitates tumor development and immune evasion.

Figure 6: (a) The volcano plot shows the differentially expressed genes in the bladder cancer cohort and (b) the heatmap shows the differentially expressed genes in the bladder cancer cohort.