Single-Cell and Transcriptome-Based Immune Cell-Related Prognostic Model in Clear Cell Renal Cell Carcinoma

Traditional studies mostly focus on the role of single gene in regulating clear cell renal cell carcinoma (ccRCC), while it ignores the impact of tumour heterogeneity on disease progression. The purpose of this study is to construct a prognostic risk model for ccRCC by analysing the differential marker genes related to immune cells in the single-cell database to provide help in clinical diagnosis and targeted therapy. Single-cell data and ligand-receptor relationship pair data were downloaded from related publications, and ccRCC phenotype and expression profile data were downloaded from TCGA and CPTAC. Based on the DEGs of each cluster acquired from single-cell data, immune cell marker genes, and ligand-receptor gene data, we constructed a multilayer network. Then, the genes in the network and the genes in TCGA were used to construct the WGCNA network, which screened out prognosis-associated genes for subsequent analysis. Finally, a prognostic risk scoring model was obtained, and CPTAC data showed that the effectiveness of this model was good. A nomogram based on the predictive model for predicting the overall survival was established, and internal validation was performed well. Our findings suggest that the predictive model built and based on the immune cell scRNA-seq will enable us to judge the prognosis of patients with ccRCC and provide more accurate directions for basic relevant research and clinical practice.


Introduction
RCC is a typical type of malignant tumour of the urinary system. According to the most recent report on cancer statistics, the number of newly diagnosed cases has climbed to 65,000 annually in the United States, resulting in around 15,000 fatalities annually, making it the sixth most prevalent tumour [1]. Clear cell renal cell carcinoma (ccRCC) accounts for around 80% of renal cancer pathological types, and its survival results were poorer than other subtypes of kidney tumours (such as papillary renal cell carcinoma and chromophobe renal cell carcinoma) [2]. Nearly 20% of ccRCC cases progress to an advanced stage at the beginning of diagnosis, and the fve-year overall survival (OS) rate of metastatic cases dropped to about 10% [3]. With the development of immunotherapy, radiotherapy, and surgical intervention, combined strategies have greatly promoted carcinoma control. However, the actual clinical efcacy still needs to be improved, and 30% of patients with local ccRCC inevitably experience cancer-related progression and recurrence [4]. Recently, although targeted therapy has been shown to prolong the survival time of patients with metastases, the median survival time is still less than 3 years [5]. In addition, drug resistance and economic burden are unavoidable major problems in clinical practice [6]. Terefore, exploring the molecular mechanism of ccRCC pathogenesis and new therapeutic targets is still a challenging issue.
A crucial aspect of carcinoma is its comprehensive heterogeneity, which can cause individuals to react diferently to the same treatment. Despite many eforts to clarify tumour heterogeneity, so far, the understanding of it is still mainly limited to the level of tumour cells [7]. Previously, it has also been proved that stromal cells and tumour-infltrating immune cells exhibit heterogeneity [8]. Similarly, the tumour microenvironment (TME) is gradually regarded as a potential solution for drug treatment targets [9]. In addition to anti-PD-1/PD-L1 or anti-CTLA-4 treatment strategies, tumour-associated macrophages (TAMs) [10] and cancer-associated fbroblasts (CAFs) [11] have also been previously reported as potential strategies for cancer treatment research. ROS is also an important factor in cancer treatment, it causes structural proteins to oxidise, which disables the proteolytic process. Tese reactions change how an enzyme works or how proteins are created. Te latter could have a wide range of functional impacts downstream, including inhibition of binding and enzymatic activity, an increase or decrease in cellular uptake, inactivation of DNA repair enzymes, and a reduction in the fdelity of damaged DNA polymerases during DNA replication [12]. Te successful implementation of these treatment plans requires a deeper insight of intratumoural heterogeneity.
It is obviously impossible to analyse intratumoural heterogeneity at the cellular level since traditional bulk RNA sequencing is predicated on the idea that every gene is expressed equally in each cell. However, through the application of singlecell RNA sequencing (scRNA-seq), it is possible to conduct single-cell transcriptome analysis. Te latest progress in scRNAseq has facilitated the transcriptional classifcation of many malignant tumour cell types, including breast cancer, lung cancer, and pancreatic ductal adenocarcinoma [13,14]. Moreover, scRNA-seq is expected to have clinical utility in refractory cancer cases and is a noninvasive method for monitoring circulating cancer cells, analysing intratumoural heterogeneity, and sensitively estimating recurrent tumours [15].
We conducted a series of bioinformatics analyses using data from other publications about scRNA-seq in order to investigate the intratumour heterogeneity in ccRCC. We combined these analyses with ligand-receptor network analysis, immune cell multilayer network analysis, and weighted gene co-expression network analysis (WGCNA) to identify hub genes for creating an immune cell-related prognostic model. It would have several potential targets for ccRCC therapy. Moreover, we also investigated the prognostic value of immune cell clusters by correlating the scRNA-seq data with the data from Te Cancer Genome Atlas (TCGA) and Clinical Proteomic Tumor Analysis Consortium (CPTAC) datasets. Our work will help elucidate the biology of ccRCC and promote the improvement of clinical diagnosis and treatment strategies for patients with ccRCC.

Raw Data
Acquisition. ccRCC single-cell transcriptome data was downloaded from a database published by Young et al. [16]. Te datasets of RNA sequencing profles and the related patient clinical traits of ccRCC were downloaded from TCGA (https://portal.gdc.cancer.gov/) and CPTAC (https://cptac-data-portal.georgetown.edu/study-summary/ S050), separately. Ligand and receptor data for building the multilayer network were acquired from [17].

Data
Processing. For single-cell data, "limma," "Seurat," "dplyr," and "magrittr" R packages were used for analysis. Data fltering criteria included the following: (1) the number of genes in the data sample was controlled between 200 and 5,000; (2) the number of gene sequences was mainly distributed between 0 and 20,000; and (3) the percentage of mitochondria was controlled below 5%. Ten, the log was taken for standardisation, and the frst 2,000 genes with the larger coefcient of variation between cells were selected for analysis. Next, principal component analysis (PCA) dimensionality reduction was performed, the data were standardised before dimensionality reduction, and fnally, signifcant dimensions were selected for subsequent analysis. Since the form of data downloaded from TCGA is log2-(data + 1), log processing is not necessary and the standardisation was done directly. Before standardisation, the data must be processed using log2-(data + 1) after being retrieved from the CPTAC database. Te "limma" R package was used to carry out the standardisation.

Cell Type Recognition and Clustering Analysis.
Te recognition of diferent cell types was based on the "limma," "Seurat," "dplyr," and "magrittr" R packages. We used the 20 principal components (PCs) selected in the previous step to perform TSNE clustering. Subsequently, the cell type was annotated through the "singleR" R package. In order to facilitate the display of subsequent results, we have annotated both subpopulations and single cells.

Identifcation of Diferentially Expressed Genes in Each
Cluster. We used several R packages, including "limma," "Seurat," "dplyr," and "magrittr" to analyse the genes contained in each cluster. Te FindAllMarkers algorithm was used to screen and calculate the diferentially expressed genes (DEGs) in each cluster. Te screening standard is |logFC| > 0.5, and the P value after correction is <0.05.

Immune Cell Function Status Analysis.
We used "GSVA" and "GSEABase" R packages to conduct functional status analysis on samples annotated by single cell, and we referred to the marker genes of immune cell functional status provided by the CancerSEA (https://biocc.hrbmu.edu.cn/ CancerSEA/home.jsp) database to clarify the functional status of DEGs in immune cells.

Construction of Ligand-Receptor and Immune Cell
Multilayer Networks. Te construction of the ligand-receptor network was carried out using the "igraph" R package. To make sure that the ligand genes and associated receptor genes were all included in the gene set taken in union, we frst took the intersection of the genes in the ligand-receptor table provided in the literature [17] and the diferential genes in all immune cell clusters and the marker genes of all included immune cells. Ten, we obtained the data for transcription factors and their target genes from the TRRUST (https://www.grnpedia.org/ trrust/) database and combined it with the data for ligandreceptor network genes, which is the intersection of the transcription factors' target genes and network genes.

Weighted Gene Co-Expression Network Analysis.
Trough the WGCNA algorithm [19], the genes in the immune cell multifactor network were used to construct a co-expression network to fnd the key modules related to OS and OS time. An appropriate soft threshold value was determined by an R software package (https://www.r-project.org/) to implement according to the WGCNA algorithm. Te gradient method was used to test diferent power values (ranging from 1 to 20) in both the scale independence degree and the module's average connectivity. Te most suitable power value could be fxed when the independence degree was above 0.9, as well as when the average connectivity degree was relatively higher [20,21]. Te WGCNA algorithm was also implemented in the construction of scale-free gene co-expression networks. Meanwhile, the corresponding gene sequencing information in each module was extracted. To assess modular feature associations, correlations between module eigengenes (MEs) and clinical features were applied. MEs are the most important components in the PCA of each gene module. Te determination of relevant modules needs to be based on the calculation of the correlation strength between MEs and clinical features. Te correlation was assessed by gene signifcance (GS � lgP), where the P value was derived from the linear regression analysis of gene expression and clinical information. Te key module takes the highest correlation coefcient among all modules, which was selected out for the next step [22].

Key Module Functional Enrichment Analysis.
Te sequencing information of genes in the key modules from WGCNA was utilized by using the "clusterProfler" R package to perform gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analyses. Among them, GO is to annotate biological processes (BPs), molecular functions (MFs), and cellular components (CCs). Te criterion for screening in GO term is P value <0.05. Te screening criteria for the KEGG pathway are minGSSize � 5, maxGSSize � 500, and qvalueCutof � 0.05.

Selection of Candidate Prognostic Related Genes.
Univariate Cox regression analysis was performed through the "survival" R package to screen prognostic characteristic genes from the previous OS-related WGCNA key modules. When the P value is less than 0.05, that is, when the diferential expression of these genes has a signifcant impact on the patient's OS, these genes can be regarded as potential prognostic related genes. Data in this step were from ccRCC cancer samples in TCGA.

Construction, Evaluation, and Validation of Disease
Prognosis Risk Model. For the candidate prognostic related genes, combined with their expression in TCGA, univariate Cox regression analysis was used to obtain genes with more signifcant risk. Ten, LASSO dimensionality reduction with 1,000 iterations was performed to screen out redundant genes to obtain more precise prognostic related genes with high hazard ratio (HR) to construct a risk prognosis model. Te following formula was used to calculate the risk score for each patient by using a linear combination of specifc features that were weighted by their respective coefcients from LASSO: where n is the number of prognostic genes, exp i is the expression value of the i-th gene, and ß i is the regression coefcient of the i-th gene in the LASSO algorithm.
According to the risk score of each patient given by the model, the median was taken as the cutof value, and the samples were divided into high and low risk groups. Te time-dependent receiver operating characteristic (ROC) curve was used to evaluate the predictive ability of the model's 1-, 3-, and 5-year survival periods. Te survival curves of the high and low risk groups were also analysed. Te CPTAC dataset was taken as the external validation database to verify the prognostic risk model.

Construction and Assessment of a Predictive Nomogram.
Nomograms are widely used to predict the prognosis of cancer patients, mainly because they can simplify statistical prediction models into a single numerical estimate of OS probability tailored to individual patient conditions. In this study, the prognostic model was used to construct a nomogram to assess the probability of OS in patients with ccRCC at one, three, and fve years. Subsequently, discrimination and calibration were carried out. Te discrimination of the nomogram was calculated by the bootstrap method using the consistency index (Cindex), with 1,000 resamples. Te value of the C-index is between 0.5 and 1.0, where 1.0 means that the results of the model can be correctly distinguished and 0.5 means random chance. Te calibration curve of the nomogram is evaluated graphically by plotting the relationship between the predicted probability of the nomogram and the observed rate. Overlapping with the reference line indicates that the model is exactly the same. In addition, we also compared the predictive accuracy between nomogram built only with risk score and nomogram combined with all factors using ROC analysis.
section, and we obtained 30,092 cells in total. In addition, we found that the correlation between the sequencing depth and the detected genes was 0.95, indicating that the deeper the sequencing depth was, the more the genes were detected. Subsequently, we selected 2,000 genes with large variances for PCA analysis. Te diferences of all 20 PCs were extremely signifcant, indicating that the theoretical value and the actual value are quite diferent which can be used for subsequent analysis.
Tere were 607 samples in the KIRC expression profle data of TGCA, of which 72 were paracancerous samples, 535 were cancer samples, and one sample had incomplete clinical information which was then removed. Finally, the data of 534 ccRCC tumour tissue samples were used for subsequent relative analysis. Among the data downloaded from the CPTAC database, there were 110 cancer samples and 75 paracancerous samples. Only 103 cancer samples contained clinical information. As a result, the data of these 103 samples were eventually used for analysis.

Diferentially Expressed Genes and Functional Enrichment in Diferent Immune Cell Subgroups.
We performed diferential expression analysis on the genes in 23 clusters obtained in the above step and displayed the frst fve genes in each cluster (Figure 2(a), Supplementary Table 2). According to the results of gene diferential expression, we analysed the functional status of the annotated immune cell clusters. In each immune cell type, the enrichment degree of hypoxia and quiescence was relatively high. Besides the enrichment levels of EMT, invasion and stemness in B cells were also relatively high (Figures 2(b)-2(f )).

Identifcation of Immune Cell Marker Gene Expression.
A total of 42 immune cell marker genes related to ccRCC were downloaded from the CellMarker database [18] and subjected to diferential expression analysis. Te results are shown in the heat map (Figure 2(g)).

Ligand-Receptor Network.
In order to construct the ligand-receptor network, we frst took the union of the diferential genes of all immune cell clusters and the marker genes of all these immune cells. Afterwards, we intersected them with the ligand-receptor relationship pairs downloaded from [17]. Finally, a total of 981 pairs of ligandreceptor relationships were obtained (Figure 3(a), Supplementary Table 3).

Immune Cell Multifactor Network Based on Ligand-Receptor Network Combined with Transcription Factors.
Intersecting genes in 981 ligand-receptor relationship pairs with transcription factor target genes, we obtained 7,987 immune cell multifactor network relationship pairs (Supplementary Table 4). Ten, 966 genes were obtained by intersecting the 973 genes contained in the network and the genes in TCGA dataset about ccRCC (Supplementary Table 5). Because there are many relationship pairs, Figure 3(b) only shows a network diagram of partial genes.

Co-Expression Network.
Te construction of coexpression modules included 966 genes from the immune cell multifactor network. Te appropriate scale-free power value was screened out as 4 (Figure 4(a)). All constructed modules are painted with diferent colours, and the cluster trees of genes are also shown in Figure 4(b). Te black and magenta modules were chosen as the key modules, since they had the highest correlations with OS and OS time of ccRCC (Figures 4(c) and 4(d)). Te correlations between MEs and clinic traits are shown in Figure 4(e). Tere were 53 genes in these two modules (Supplementary Table 6). For a deeper understanding about the biofunctions of these modules, genes in these modules were carried out to conduct GO and KEGG pathway analyses by using the "cluster-Profler" R package. According to the P value of each term, the top terms in the GO and KEGG pathways were extracted out and visualized (Figures 4(f )-4(i)).

Prognostic Risk Scoring
Model. Using the "survival" R package to perform univariate Cox regression analysis on the 53 genes contained in the key modules of WGCNA, 28 genes with P value <0.05 were obtained. Figures 5(a)-5(d) show the survival analysis results of four genes among them. Ten, the 28 genes with signifcant prognostic diferences were subjected to LASSO regression analysis. We adopted the Cox proportional hazard model (family = "Cox") to calculate the HR and P values of these genes ( Figure 5(e)) and then randomly simulated 1,000 times (maxit = 1000) to fnd the most suitable λ value in LASSO regression ( Figure 5(f )). Finally, "lambda.min" was used to screen out 16 genes for constructing a risk scoring model from these 28 genes ( Figure 5(g)).

Prognostic Model Prediction Efectiveness Evaluation and External Dataset Verifcation.
In the evaluation of the predictive efcacy of the prognosis model, Kaplan-Meier (KM) survival analysis was performed on the high and low risk groups, and the diference was signifcant ( Figure 5(h)). Moreover, in its ROC curve, the one-year AUC value was 0.794, the three-year AUC value was 0.746, and the fve-year AUC value was 0.763 ( Figure 5(j)). In the external CPTAC dataset, KM survival analysis was performed on the high and low risk groups, and the diference was also signifcant ( Figure 5(i)). In addition, the one-year AUC value in its ROC curve was 0.783, and the three-year AUC value was 0.762 ( Figure 5(k)). Because the external data do not have fve-year survival data, only one-year and three-year ROC analysis was performed.

Predictive Nomogram.
For the purpose of building a clinically applicable method to estimate the survival possibility of patients with ccRCC, we developed a nomogram to predict the probability of 1-, 3-, and 5-year OS based on the data in TCGA. Te predictors of the nomogram included age, gender, T, N, M, grade, risk score, and stage ( Figure 6(a)). Te C-index for the model for evaluation of OS was 0.799. Te visual calibration chart was used to evaluate the performance of the nomogram. When the angle of the line is 45°, it represents the best prediction result. Tus, our calibration chart indicated that the nomogram has a good predictive ability (Figures 6(b)-6(d)). Te AUC values of the nomograms combined with all factors were greater than the nomograms built only with risk score in spite of the fact that their values were all more than 0.7. Tis indicated that the  0  1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19  20  21

Discussion
Te emergence of next-generation sequencing (NGS) has provided a feasible and cost-efective way to determine the transcriptional landscape of human cancers, including both bulk and single-cell resolution with unprecedented base-pair precision [23][24][25]. It has been established that cancer is attributed to dysregulated evolution [26,27] in acquiring inheritable genetic/epigenetic traits [28][29][30]. However, the presence of tumour heterogeneity poses substantial challenges in the diagnosis and treatment of tumours [31][32][33][34]. Tumour heterogeneity exerts a vital role in various aspects (e.g., intertumour, intratumour, and intermetastasis heterogeneity, interdisease and interpatient heterogeneity, epigenetic and metabolic heterogeneity, TME heterogeneity, and tumour-intrinsic genetic heterogeneity)    Journal of Oncology [35,36,[36][37][38]. A landmark paper has demonstrated that ccRCC is a heterogeneous disease with marked genetic intermetastases and intratumour heterogeneity (G-IMH and G-ITH) [39]. Further studies have elucidated whether somatic mutation landscape and genetic heterogeneity infuence the clinical outcomes of ccRCC tumour management [40]. Because of this, we adopted a series of bioinformatics methods to use the ccRCC single-cell data in published articles and the ccRCCrelated data in public databases to study whether immune cell-related genes can construct a predictive prognostic model for patients with ccRCC, which may be helpful for further understanding of the intratumour heterogeneity of ccRCC, and provide corresponding support for related basic research and clinical applications in the future.
Since there are many genes used to annotate a certain cell, it is usually difcult to determine which of these genes are critical. As a result, we built some networks, hoping to better fnd key genes related to our target clinical traits to construct a risk prediction model. Researchers have traditionally been concerned with a few or linear pathways between diferent cells. Identifying the signalling network of communication within diferent cell types is invaluable in the diagnosis and treatment of ccRCC tumours. Furthermore, a complete network of cell-cell signalling includes not only intercellular signalling pathways but also intracellular signalling transduction and gene expression [41]. Tus, a complete network of molecular signalling mechanisms is required to show the interaction between the TME and related cell types. A study has proved a potential scRNAseqtranscriptome-based multilayer network approach, which can be considered as a useful technique to identify the interplay between the TME and tumour cells, further predicting the prognostic and predictive signatures of    cancer patients [17]. In addition to the multilayer network, we also applied the WGCNA algorithm to explore the hub genes in key modules associated with the OS and OS time. Te WGCNA algorithm explores the relationship between co-expression modules and clinical traits, which provides an opportunity to identify the hub genes in a module but not a downstream gene; thus, it possesses the distinct advantage of making the results more reliable and higher in biological signifcance [42]. In our study, we found all the genes related to the immune cells in the ccRCC samples through multilayer networks, then divided these genes into multiple modules using the WGCNA method, and used the genes in the modules with the strongest correlation with OS and OS time as the candidate genes for risk scoring model construction. Trough the survival diference analysis of the genes in the key modules of the WGCNA algorithm, the genes with signifcant prognostic signifcance were found and the genes used to construct the risk prediction model were confrmed after LASSO dimensionality reduction processing. Subsequently, we verifed the feasibility and efectiveness of the model for assessing the prognosis of patients with ccRCC through nomogram, which also showed that the immune cells in ccRCC do have an impact on the prognosis of patients.
Some genes in the prognostic risk scoring model have been proven to exert various efects on the regulation of certain tumours or diseases. Previous study has found that an imbalance of APP may be involved with the negative correlation between cancer and Alzheimer's disease [43]. As a vital target in the TLR signalling pathway, CD14 exerts a dual efect on oncogenesis, which can initiate several tumour-related signalling pathways or alter the immune microenvironment in the tumour [44]. COL1A1 was considered to be associated with the pathogenesis of COL1A1-PDGFB fusion uterine sarcoma [45]. It was reported that DDR1 is involved in the development of cancer and fbrotic diseases [46]. Regulating E2F5 is of great signifcance in maintaining genome stability and the cell cycle [47]. Study has shown that if certain signal mutations cause the destruction of FGF1, it is likely to give rise to cancer [48]. Te dysregulation of HDAC1, a chromatin modifer, may cause harmful efects on cell fate and function, which could lead to cancer [49]. HIPK2, a multitalented protein, utilizes its kinase activity to regulate several pathways to limit the proliferation and diferentiation of tumour cells and induce positive responses to therapies [50]. Since they are susceptible to ROS, proteins are typically the target of increased free radical production. ROS lead to the oxidation of structural proteins, which shuts down the proteolytic mechanism. Tese reactions alter the way proteins are built or how an enzyme functions. Te latter could have many diferent downstream functional effects, such as inhibition of enzymatic and binding activities, an increase or decrease in cellular absorption, inactivation of DNA repair enzymes, and a decrease in the fdelity of damaged DNA polymerases in DNA replication [12]. HOXA9, a homeodomain-containing transcription factor, exerts a vital role in the proliferation of hematopoietic stem cells and is commonly negatively afected in acute leukaemias [51]. Recent study has shown that ITGA6 can be a useful biomarker for early detection of colorectal cancer cells in a noninvasive assay and as a prognostic factor [52]. L1CAM has been found in various types of human cancers, which indicates a bad prognosis [53]. NFKBIZ is a psoriasis susceptibility gene that encodes the functions of TRAF6 signalling players, especially in terms of positive and regulatory immune controls by interactions between immune cells and epithelial cells [54]. Oncogenic gene fusion involving NRG1 contributes to the activation of ErbB-mediated pathways, representing a potential target for tumour management [55]. PAX2 has been found in epithelial tumours of the kidney and the female genital tract [56]. TEAD4 binds with YAP, TAZ, VGLL, and other transcription factors to modulate various tumour-related processes, including tumour cell proliferation, cell survival, tissue regeneration, and stem cell maintenance, in cancer via its transcriptional output [57]. Te abovereported functions and mechanisms of these genes could help elucidate their correlations with ccRCC and provide reliable evidence for further determination of diagnostic and therapeutic methods.
Although our study only used published data and information in public databases and did not directly use clinical samples for experimental testing and analysis, it is still sufcient to show that the data obtained through single-cell sequencing is able to provide an efective support to predict the prognosis of patients with ccRCC. Additionally, our research can also provide ideas for clinical diagnosis and treatment. For example, the genes in the risk prediction model we have established are more likely to become marker genes for clinical screening of ccRCC or therapeutic targets for metastatic ccRCC. Furthermore, our methods and results would enhance theoretical assistance for other researchers to explore other cancers related to tumour heterogeneity in the future.

Conclusion
Cancer has been proven to be caused by dysregulated evolution [27] that results in the acquisition of heritable genetic or epigenetic characteristics. However, the occurrence of tumour heterogeneity creates signifcant difculties for both tumour identifcation and treatment. ccRCC is a heterogeneous disease with marked genetic intermetastases and intratumour heterogeneity (G-IMH and G-ITH). Te purpose of this study is to determine whether immune cell-related genes can be used to build a predictive prognostic model for patients with ccRCC.
In our study, we used multilayer networks to identify all the immune cell-related genes in the ccRCC samples. We then used the WGCNA method to separate these genes into various modules, and we used the genes in the modules with the strongest correlation with OS and OS time as the candidate genes for risk scoring model construction. Following all steps as detailed in result and discussion section, we then used a nomogram to validate the viability and efcacy of the model for determining a patient's prognosis for ccRCC, which also demonstrated that the immune cells in ccRCC do afect the prognosis of patients.
In a nutshell, our results indicate that the immune cell scRNA-seq predictive model will help us to assess the prognosis of patients with ccRCC and provide more precise guidelines for basic related research and clinical management. As a result, it may help to further our understanding of the intratumour heterogeneity of ccRCC and support future basic research and clinical applications.