Cell Differentiation Trajectory Predicts Prognosis and Immunotherapeutic Response in Clear Cell Renal Cell Carcinoma

Clear cell renal cell carcinoma (ccRCC) is the main type of malignancy in kidney related to glucose metabolism. Primary single cell culture and single cell sequencing are novel research technologies. In this study, we explored the differentiation status of ccRCC cells and its significance in prognosis and immunotherapeutic response through bioinformatics. We characterized distinct differentiation states and differentiation-related genes (DRGs) in ccRCC cells through single cell RNA sequencing (scRNA-seq) analysis. Combined with bulk RNA-seq data, we classified patients into two clusters and found that this classification was closely correlated with patient prognosis and immunotherapeutic responses. Based on machine learning, we identified a prognostic risk model composed of 14 DRGs, including BTG2, CDKN1A, COL6A1, CPM, CYB5D2, FOSB, ID2, ISG15, PLCG2, SECISBP2, SOCS3, TES, ZBTB16, and ZNF704, to predict the survival rate of patients and then constructed a nomogram model integrating clinicopathological characteristics and risk score for clinical practice. In the study of immune checkpoints, we found that patients in the high-risk group had a disposition to get worse prognosis and better effects of immune checkpoint blocking therapies. Finally, we found the expression level of model DRGs was associated with a tumor-immune microenvironment (TIME) pattern and the response of 83 compounds or inhibitors was significantly different in the two risk groups. In a word, our study highlights the potential contribution of cell differentiation in prognosis judgment and immunotherapy response and offers promising therapeutic options for ccRCC patients.


Introduction
Clear cell renal cell carcinoma (ccRCC) is the main type of RCC, accounting for about 70% of adult clinical cases. ccRCC is characterized by the loss of von Hippel-Lindau and the accumulation of robust lipid and glycogen [1]. Local ccRCC can be detected early and successfully treated by surgery, while metastatic ccRCC is almost always fatal. Te lack of sensitivity to chemotherapy and radiation therapy has brought great trouble to clinicians and also brought huge burden to patients. Over the past decade, several targeted agents and immunotherapies have been added to the treatment plan of metastatic ccRCC [2][3][4]. Due to the high heterogeneity of ccRCC, previous classifcations cannot satisfactorily predict the prognosis of patients with the same diagnosis [5,6]. Furthermore, although countless prognostic models with signifcant genes have been constructed, the accuracy of their prediction performance still needs to be further confrmed and improved in clinical practice [7][8][9]. Immunotherapy, as a new therapy, has been widely used in multiple tumors [10]. Clinical practice has proved that immunotherapy had a good efect on most cancers and thus immunotherapy for ccRCC has been paid increasing attention by researchers [11,12]. However, due to the lack of accurate predictive biomarker selection, only a few ccRCC patients have achieved an efective immune response in clinical trials [13,14]. Terefore, it is urgent to construct an efective classifcation or biological prediction index to distinguish the prognosis and immunotherapy response of ccRCC patients.
Compared with the traditional bulk RNA sequencing technology, which can only refect the average variation level of tumor for all cells in the sample, the single-cell sequencing (scRNA-seq) technology has provided unprecedented molecular resolution for researchers to study cancer [15,16]. Tumor characteristics hidden in the heterogeneity of cell population could be revealed through single-cell genomics and trajectory analysis, which could ofer possible prognostic biomarkers for better individualized treatment [17,18]. However, the clinical samples of scRNA-seq research are limited and cannot be associated with clinicopathological data. In this case, the utilization of scRNA-seq data could be optimized by integrating bulk RNA-seq data. So far, few studies have focused on the construction of prognostic risk based with diferentiation-related genes (DRGs). It is also unclear whether the novel classifcation based on cell diferentiation trajectory is related to tumor biological behavior and whether cell differentiation plays a part in predicting the prognosis and immunotherapeutic response of ccRCC patients.
In this study, transcriptomic data of ccRCC samples were used to test our deduction. First of all, we used scRNAseq data to identify ccRCC cell subsets in diverse diferentiation states and signifcant DRGs through trajectory analysis. Second, we employed bulk RNA-seq data to classify ccRCC patients based on these DRGs and proved that this novel classifcation showed signifcantly diferent prognoses and immunotherapy responses. Tird, 14 DRGs were identifed as key genes-related ccRCC prognosis and a prognostic risk model was constructed by these DRGs. Next, we comprehensively made an exploration of TIME and drug sensitivity based on this 14-DRGs prognostic risk model. At last, a clinically applicable nomogram integrating clinicopathological characteristics and risk score was developed successfully for ccRCC patients.

Data Acquisition.
Te scRNA-seq data and bulk RNAseq data of ccRCC samples included in this study are available in the Cancer Genome Atlas (TCGA, https://portal. gdc.cancer.gov/) and the Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo/) database [19,20]. Bulk RNA-seq data and available clinical information were available for 519 samples in the TCGA-KIRC dataset and 39 samples in the GSE29609 dataset [21]. Te corresponding clinicopathological characteristics are listed in the Supplementary Tables 1 and 2. In addition, the scRNA-seq data were obtained from the GSE156632 dataset and contained 35,433 cells from tumor tissues of 7 ccRCC patients. Supplementary Table 3 presents the corresponding clinicopathological features.

2.2.
Processing of the scRNA-seq Data. Te "Seurat" package was employed for the scRNA-seq data processing, including quality control, data exploration, statistical analysis, and result visualization [22]. First, low quality cells were excluded according to the following quality control criteria: (1) genes detected in <500 cells; (2) cells with <1,000 or >20,000 detected genes; and (3) cells with >10% of mitochondrial expressed genes. Ten, the batch efects of the scRNA-seq data were corrected by "harmony" package [23]. Tird, the scRNA-seq data were normalized through "LogNormalize" method and subsequently, top 1,000 highly variable genes were identifed by "VST" package [24]. Next, principal component analysis (PCA) was used for dimensionality reduction of ccRCC cells to determine the signifcantly available dimensions (P < 0.05) [25]. Based on top 11 PCs, the uniform manifold approximation and projection (UMAP) algorithm was utilized for dimensionality reduction and clustering across all ccRCC cells [26]. Genes with the cutof criteria of adjusted P < 0.05 and |log 2 fold change (FC)| > 1 were regarded as the marker genes in each cluster through "limma" package [27]. Finally, according to the marker genes, these clusters were annotated using "singleR" package and manually verifed and corrected through the CellMarker database and references [28,29].

Trajectory Analysis and DRGs Identifcation.
Trajectory analysis could reduce high-dimensional representations to low-dimensional space by reducing master map learning. Cells were casted into this space and arranged as a trajectory with branch points. In addition, cells distributed in the same branch were considered to have similar diferentiation status and characteristics. Tese genes with diferent expression levels among branches were identifed and defned as diferentiation-related genes (DRGs). Terefore, this study analyzed the trajectories of renal tubular and cancer cells by the "Monocle 2" package [30] and the enrichment analysis for these DRGs in diferent branches was performed using "clusterProfler" package [31].

Classifcation for ccRCC Patients According to DRGs.
Te data of the TCGA-KIRC dataset were performed to make a transformation to transcripts per million (TPM) values and merged with GSE29609 dataset as training cohort [32]. Te training cohort was normalized with log 2 scale transformation and the batch efect was corrected by "SVA" package [33]. Tese processed expressions of DRGs were subsequently extracted for nonnegative matrix factorization (NMF). Ten, "survival" package was employed for the cox regression analysis to explore the correlation between the expression patterns of all DRGs and overall survival. DRGs with P < 0.01 were considered to be signifcantly associated with prognosis and selected for further analysis. Next, the unsupervised clustering method of NMF was carried out by "NMF" package and the optimal number of clusters is selected as the coexistence correlation coefcient [34]. Te K-M survival analysis was performed to predict the diverse prognosis of ccRCC patients under this novel classifcation [35]. Te proportion of main clinicopathological characteristics in each cluster was displayed as stacked histograms. PCA was subsequently conducted to show the result of DRGs clustering in diferent clusters. Finally, the gene set variation analysis (GSVA) method was utilized to analyze the diferences of molecular functions and pathways enriched in diferent clusters [36]. |log 2 FC| > 0.1 and adjusted P < 0.05 were considered to be signifcant. Te KEGG and ontology gene sets (c2.cp.kegg.v7. 5

Recognition of TIME and Immune Patterns According to
Novel Classifcation. ESTIMATE could use the unique characteristics of cancer tissue transcription spectrum to infer tumor cells and normal cells with diferent infltration [38]. Four indicators, including immune score, stromal score, ESTIMATE score, and tumor purity, were applied to identify TIME of each sample through "ESTIMATE" package. CIBERSORT, a deconvolution algorithm based on gene expression pattern, was employed to quantify the composition of cells involved in the immune response [39]. Te abundance of infltrating immune cells was measured to identify immune patterns patients in diferent clusters. Te main immune checkpoints of ccRCC patients were summarized from relevant studies [1,13,[40][41][42][43]. Moreover, diferential analysis was conducted to reveal the diferent expression levels of immune checkpoints. Te K-M analysis was performed to explore the association between immune patterns and patient survival. Diferent immunotherapeutic scores of ccRCC patients in diferent clusters were also calculated, and subsequently, the results were visualized through "ggplot2" package [44].

Construction and Validation of a Prognostic Risk Model
Based on DRGs. Te merged data consisting of TCGA-KIRC dataset and GSE29609 dataset were treated as the training cohort while the TCGA-KIRC dataset was treated as the validation cohort. First, WGCNA was utilized to identify modules related to the prognosis of ccRCC in the training cohort [45]. Subsequently, the univariate cox regression analysis was employed to select statistically signifcant DRGs (P < 0.01) and the prognostic value in the critical module was evaluated through "survival" package. Ten, the LASSO regression analysis was carried out for further selection of prognosis-related DRGs and a risk model with these DRGs was constructed to predict the prognosis of ccRCC patients [46]. Te risk score of each sample was calculated according to formula (1): In which "Exp" stands for the expression levels of DRGs and "β" represents the regression coefcients of DRGs. Based on the median risk score, all patients could be divided into two types: low-risk and high-risk. Te K-M survival analysis was performed to compare patient survival in the two risk groups. Te concordance index and the ROC curve analysis were applied to evaluate the accuracy of this risk model [47]. In addition, the validation cohort was also used to further verify the performance of this prognostic risk model in predicting 1-year, 3-year, and 5-year survival rates.

Development and Evaluation of a Prognostic Nomogram.
Univariate and multivariate cox regression analyses were performed in both the training cohort and the validation cohort to determine which were independent clinicopathological characteristics. Based on "rms" package, these independent characteristics and risk score were all used for the development of a prognostic nomogram for clinical practice [48]. Ten, the accuracy of this nomogram was identifed through calibration curves and discrimination performance was evaluated through C-index and ROC curves. Finally, this nomogram was validated in validation cohort.

Prediction of the Immunotherapeutic Response and Drug
Sensitivity. TIDE (https://tide.dfci.harvard.edu/) is a computational method that simulated tumor immune escape by combining with the expression patterns of T cell dysfunction and rejection [49]. Based on the preprocessing data, the TIDE algorithm was carried out to predict the clinical response of immune checkpoint blocking (ICB) in ccRCC patients. Furthermore, "pRRophetic" package was employed to estimate the half maximum inhibitory concentration (IC50) values of compounds or inhibitors as a reference for clinical chemotherapy and targeted therapy of ccRCC patients in diferent risk groups or clusters [50].

Identifcation of Cell Clusters Using scRNC-seq Data.
After the preprocessing of scRNA-seq data, including quality control, normalization, and batch efect correction, 32,400 cells from the GSE156632 dataset were included in the analysis (Figure 1(a)). Te number of genes detected was signifcantly correlated with the sequencing depth (R � 0.94, Figure 1(b)). Te dimensional reduction plot displayed the batch efect after correction (Figure 1(c)). Ten, 14,285 genes were identifed and top 1,000 genes were recognized as highly variable genes through variance analysis ( Figure 1(d)). Available dimensions were determined through a principal component analysis (PCA), and subsequently, related genes were identifed in each principal component (PC). Te dot plots and heatmaps showed the expression levels of 30 signifcantly related genes in 6 top PCs (Figures S1a-S1c). Cell cluster analysis was performed on 11 PCs with a P value <0.05 (Figures S1c and S1d).
Afterward, the UMAP algorithm was applied to classify 32,400 cells into 23 clusters (Figure 2(a)). Te top 5 differentially expressed marker genes of each cluster were visualized as a dot plot ( Figure S2a). According to these marker genes, the cells distributed in 23 clusters were annotated ( Figure 2 Figure 1: Preprocessing of the scRNA-seq data: (a) violin plots of the RNA information of processed scRNA-seq data, (b) scatter plot of the correlation between the numbers of detected genes and sequencing depth, (c) scatter plot of the batch efect after correction, and (d) scatter plot of 1,000 highly variable genes. 4 Genetics Research     Diferential analysis was performed to identify pseudotimedependent marker genes. Finally, a total of 715 marker genes were defned as DRGs and brought into the following analysis (Supplementary Table 4).

Classifcation for ccRCC Patients According to DRGs.
All ccRCC patients were divided into 2 clusters with the coexistence correlation coefcient (K � 2) by the NMF clustering analysis (Figures 3(a) and 3(b)). K-M survival analysis showed that patients in cluster 2 had worse overall survival compared with patients in cluster 1 (Figure 3(c)). PCA demonstrated that this classifcation could distinguish ccRCC patients signifcantly (Figure 3(d)). Patients in cluster 2 had the clinicopathological features of higher levels of age, grade, and stage, which was consistent with the survival analysis ( Figure 4(a)). Finally, diferential analyses of biological process, molecular function, cellular component, and pathway were performed on 2 clusters and the results manifested that, diferent from cluster 1, ccRCC of cluster 2 was mainly related to immune responses and tumor mechanisms (Figure 4(b)).
In general, the fndings mentioned above showed that this novel classifcation of ccRCC patients based on DRGs was reliable and could be useful to distinguish survival outcomes of diferent populations in clinical practice.

Recognition of TIME and Immune Patterns According to
Novel Classifcation. ESTIMATE algorithm calculated the diferent abundance of immune and stromal cells and tumor purity in 2 clusters. Compared with ccRCC patients in cluster 1, ccRCC patients in cluster 2 had the higher immune, stromal, and ESTIMATE score and lower tumor purity ( Figure 5(a)). Te K-M survival analysis explored the correlation of TIME and overall survival in 2 clusters and the results indicated that ccRCC patients in cluster 1 tended to have a better prognosis ( Figure 5(b)). Correlation analysis shows that both levels of infltrating immune cells and stromal cells were negatively related to the level of tumor purity ( Figure 5(c)). Based on the functional enrichment analysis of diferentially expressed genes between diferent tumor purity levels, we found that the main GO and KEGG terms were all related immune reaction ( Figure 5(d)). Moreover, the CIBERSORT algorithm was employed to make a further analysis of immune cell infltration. From the analysis results, Naive B cells, plasma cells, CD4 memory resting T cells, regulatory T cells, M0 macrophages, and neutrophils were signifcantly more abundant in cluster 2 while CD4 memory activated T cells, resting NK cells, monocytes, M1 macrophages, resting dendritic cells, and resting mast cells were signifcantly more abundant in cluster 1 ( Figure 5(e)). Patients with higher infltration of memory B cells, M0 macrophages, M1 macrophages, activated NK cells, plasma cells, CD8 T cells, follicular helper T cells, and regulatory T cells got worse overall survival while patients with higher infltration of activated dendritic cells, resting dendritic cells, eosinophils, M2 macrophages, resting mast cells, monocytes, and CD4 memory resting T cells got better overall survival ( Figure S3). According to the analysis of immune checkpoints, the expression levels of CD28, CD80, IL23A, and TNRSF18 were higher in cluster 2 patients ( Figure 5(f )). Patients with lower expression levels of CD80, CTLA4, IL23A, LAG3, PDCD1, TNFRSF9, TNFRSF14, and TNFRSF18 or higher expression levels of ARID2, BRD7, BTLA, CD274, HAVCR2, HLA-G, and PDCD1LG2 tended to have a better overall survival rate ( Figure S4). Finally, the diferent efects of anti-PD1 and anti-CTLA4 immunotherapies were estimated across 2 clusters ( Figure 5(g)). Te scores of each type immunotherapy in cluster 1 were signifcantly higher than those in cluster 2 and it indicated that cluster 1 patients were more likely to beneft from immunotherapy.  median risk score as the threshold, all patients can be divided into the two risk groups. Te association among data source, classifcation, risk score, and survival status is shown as a Sankey diagram ( Figure S5b). Patients in the cluster 2 had signifcantly higher risk scores than those in the cluster 1 ( Figure S5c). Te expression levels of SOCS3, ISG15, and COL6A1 were proportionate to the risk scores. It indicated these 3 DRGs may act as risk genes. On the contrary, other DRGs were regarded as protective genes. Te risk score had a negative correlation with the survival time and survival status of patients (Figures 6(e) and 6(f )).
It was clear that the risk score statistically correlated with grade, stage, survival time, and status ( Figure 6(g)).
Moreover, the expression levels of these 14 DRGs were signifcantly diferent between patients of the two risk groups ( Figure S6a). Enrichment analysis indicated functional signifcance of DRGs in ccRCC. (Figure 7(a)). Te K-M survival analysis demonstrated that patients with high-risk scores had a worse overall survival rate than those with a low-risk score either in the training or validation cohort (Figure 7(b)). Receiver operating characteristic (ROC) analysis manifested that this model showed excellent performance in predicting overall survival rate of ccRCC patients. Te areas under the ROC curves (AUC) to predict 1-year, 3-year, and 5-year overall survival were 0.802, 0.765, and 0.765 in training cohort, and 0.826, 0.790, and 0.790 in validation cohort (Figure 7(c)), respectively. Te efect of grade or stage as a subvariable to predict overall survival was also better ( Figure S6b). Comparing published prognostic risk models with our model, the accuracy of our model was proved to be better than others. In detail, the AUC value of the best model to

Development and Validation of a Prognostic Nomogram.
In the training cohort, the univariate cox analysis showed that age, grade, stage, and risk score all had a prognostic value. Te multivariate cox analysis indicated that all variables can be independent features to predict the prognosis of ccRCC patients (Figure 8(a)). Subsequently, the same results were obtained from the validation cohort (Figure 8(b)). Ten, a prognostic nomogram integrating age, grade, stage, and risk score was developed to ofer a clinically applicable method for the prediction of individual prognosis (Figure 8(c)). Te ROC curves showed an excellent ability of this model for the prediction of 1-year, 3-year, and 5-year overall survival rates in the training cohort and the AUC values were 0.875, 0.843, and 0.801, respectively. Te calibration curves for predicting 1-year, 3-year, and 5-year overall survival were also close to the actual observations (Figure 8(d)). Of course, the same analysis was conducted in the validation cohort (Figure 8(e)).

Prediction of the Immunotherapeutic Response and Drug
Sensitivity. Correlation analysis manifested the expression of 14 model DRGs was signifcantly correlated with the abundance of infltrating immune cells in both the training and validation cohorts. In particular, ISG15 had a diferent TIME pattern from other DRGs. (Figures 9(a) and 9(b)). Compared with ccRCC patients with a low-risk score, patients with high-risk score tended to have a better respond to immunotherapy. Te similar results were also obtained from the validation cohort (Figures 9(c) and 9(d)). Furthermore, the response of 83 compounds or inhibitors was signifcantly diferent in the two risk groups, in which 31 compounds or inhibitors had a better drug response in the high-risk group while 52 compounds or inhibitors had a better drug response in the low-risk group (Supplementary Table 7). Meanwhile, a total of 70 compounds or inhibitors were signifcantly diferent in 2 clusters, in which 41 compounds or inhibitors had a better drug response in cluster 1 (Supplementary Table 8).

C1
C2 cluster (f ) Genetics Research status of ccRCC cells is associated with the prognosis and therapy response [1,54]. In this study, we employed the scRNA-seq data in GEO database to reveal the diferentiation status of ccRCC cells. Based on novel classifcation, ccRCC patients could be divided into 2 clusters with diverse clinicopathological characteristics. At the same time, TIME, immune gene expression, and immunotherapeutic response represented signifcant diference in 2 clusters. Subsequently, a risk model composed of 14 prognostic DRGs was established to predict the prognosis of ccRCC patients and a nomogram model integrating clinicopathological characteristics and risk score was constructed for clinical practice. Finally, we compared the immunotherapeutic response and drug sensitivity of ccRCC patients in the two risk groups to explore the possibility of clinical therapy. Intratumor heterogeneity is characterized by cells with diferent features in a single tumor. Tese cells show different cell collections with diferent molecular characteristics or diferentiation status [55]. A total of 23 cell clusters were identifed and subsequently 10 cell types were obtained through annotation. In view of the fact that most cancer cells were considered to be derived from renal tubular epithelial cells, we chose renal tubular cells and cancer cells for differentiation trajectory analysis. Te diferentiation trajectory showed that renal tubular cells was the root of diferentiation and then diferentiated into 2 diverse branches representing 2 diferent types of cancer cells. According to the expression patterns of DRGs, a novel classifcation with diferent clinicopathological characteristics was performed on ccRCC patients. Te association between classifcation and diferentiation status indicated that the prognosis and immunotherapy response were related to the cell diferentiation status. Te study of DRGs recognition could be helpful to better understand the occurrence and development of ccRCC. Numerous studies have showed that cellular signaling pathways and transcriptional cascades involved in diferentiation process were associated with the occurrence and development of malignant tumors [56][57][58]. Diferentiation therapy provided a new idea for the therapy of malignant tumors which induced cancer cells by transforming signal events and then guided them to a status of higher diferentiation and lower malignancy [59,60]. Although great progress has been made in the diferentiation therapy in ccRCC, the specifc molecular mechanism and therapy targets needed to be further studied [61,62]. In this study, we identifed prognosis-related DRGs to provide more reference for clinical therapy. So far, few studies have focused on the correlation between diferentiation status and TIME in ccRCC. In our study, patients in cluster 2 tended to have a higher level of infltrating immune cells and lower level of tumor purity compared with patients in cluster 1. Moreover, patients in cluster 2 were sensitive to immunotherapy and it was consistent with the result that patients with high risk tended to have a better immunotherapeutic response. From the published studies, we found that 5 model DRGs were proved to be associated with ccRCC. In detail, Sima et al. have investigated the impact of BTG2 on growth, migration, and invasion of ccRCC cells and found overexpressed BTG2 could inhibit proliferation, migration, and invasion of ccRCC cells [63]. PANDAR, promoter of CDKN1A antisense DNA damage activated RNA, had signifcantly upregulated expression in tumor tissues and could serve as an independent predictor of overall survival in ccRCC [64]. Moreover, Zhu et al. studied the therapeutic potential of LSD1 inhibitors in ccRCC treatment and discovered that inhibition of LSD1 could decrease the H3K4 demethylation at the CDKN1A gene promoter and it was associated with P21 upregulation and cell cycle arrest at G1/S in ccRCC cells [65]. Te COL6A1 was a gene encoding the alpha 1 polypeptide subunit of collagen 6 and ccRCC patients were discovered to have signifcantly higher COL6A1 scores and intensities [66]. Like us, Wan et al. included ISG15 as one of the prognostic predictors in a constructed risk model of ccRCC [67]. Urbschat et al. observed         signifcantly lower SOCS3 messenger RNA levels in tumor tissues compared to healthy tissues and concluded SOCS3, as a negative regulator, participated in regulation of ccRCC together with STAT3 [68]. At present, it was the frst time to fnd other model DRGs, including CPM, CYB5D2, FOSB, ID2, PLCG2, SECISBP2, TES, and ZBTB16, were related to ccRCC, which needed further study. To sum up, all fndings emphasized the possibility to predict TIME and immunotherapeutic response of ccRCC patients based on prognostic DRGs. Compared to the AUC value of other prognosis risk models, our model showed a higher accuracy to predict the prognosis of ccRCC patients. Te TIDE analysis showed that patients with high risk responded better to immunotherapy than patients with low risk. It indicated that risk score could also be applied as an indicator for the prediction of immunotherapeutic response. In addition, we found that the response of 83 compounds or inhibitors was signifcantly diferent in the two risk groups which could be used as a reference for clinical therapy. We focused on 32 compounds or inhibitors showing better response in the high-risk group. Fortunately, several results of compounds or inhibitors were consistent with published studies. For example, in the presence of AKT inhibitor VIII, a pan-AKT inhibitor, ART reduced more ccRCC cell proliferation, migration, and invasion than in the absence of AKT inhibitor VIII [69]. AZD6482 selectively inhibited migration, invasiveness, and colony formation of ccRCC cells with SETD2 mutations [70]. In the xenotransplantation model of mice, AZD8055 achieved signifcantly better tumor growth inhibition and prolonged survival time of mice than sirolimus or excipients [71]. Gao et al. have provided evidence to elucidate that miR-200c could sensitize ccRCC cells to sorafenib or imatinib to inhibit cell proliferation, at least partly by targeting HO-1 [72]. von Hippel-Lindau (VHL) gene mutation was the driving force of various forms of ccRCC and MG-132 mediated proteasome inhibition could make VHL wild type cells sensitive to zafrlukast-induced cell death [73]. Te synergistic efect of sAXL with pazopanib and axitinib could reduce the growth of xenograft derived from ccRCC patients, which supported the combination of AXL inhibitors and antiangiogenic agents in the treatment of ccRCC [74]. Tapsigargin had the highest performance in the treatment of early metastatic ccRCC and could be used as an efective small molecule drug to treat early metastatic ccRCC [75]. Meanwhile, the role of other compounds or inhibitors in ccRCC needed for further confrmation.
Te current study had some drawbacks. First of all, all data were obtained from the published database rather than our own dataset. Tus, the detailed clinical information was incomplete and could not be included in the nomogram model. Given that the lack of available data, more validation should be performed on other ccRCC cohorts. On the other hand, the specifc mechanism of most DRGs was not clear in ccRCC and our study needed to be further verifed through cellular biological experiments.

Conclusion
Tis study highlighted the cell diferentiation trajectory of ccRCC cells and manifested a potential impact on the prediction of the prognosis and immunotherapeutic response in ccRCC patients. In detail, a novel classifcation of ccRCC patients was constructed and proved to be reliable in the prediction of diverse prognosis, TIME pattern, and immunotherapeutic response. Cell diferentiation-related genes were identifed; then, a prognostic risk model with these genes was constructed to predict the prognosis and immunotherapeutic response of ccRCC patients with different risk scores. We also established a nomogram composed of clinicopathological characteristics and risk score for diagnosis and estimated the drug sensitivity of ccRCC patients with diferent risk scores for treatment.

Statistical Analysis Methods.
Te comparison in multiple groups was performed using the Kruskal-Wallis test and the comparison between the two groups was based on Wilcoxon test. Te Pearson correlation test was used to study the correlation between normally distributed variables while the correlation between nonnormally distributed variables was evaluated by using the Spearman correlation test. Te Chi square test was used to analyze the distribution of categorical variables among subgroups, and the Student's t-test was used to compare continuous data between the two subgroups. In the K-M analysis, the log rank test was carried out to examine statistical diference. All data analysis and visualization were completed using R version 4.0.3.