PIWI-Interacting RNA Pathway Genes: Potential Biomarkers for Clear Cell Renal Cell Carcinoma

Background Clear cell renal cell carcinoma (ccRCC) is one of the most lethal malignancies in the urinary system, yet effective diagnostic and prognostic markers are lacking. Recently, several of piRNA pathway genes have been reported to be associated with cancer diagnosis and prognosis, but their role in ccRCC is still unclear. Methods We analysed the expression of 27 piRNA pathway genes in 539 kidney renal clear cell carcinoma (KIRC) and 72 nontumor tissue samples (data from TCGA), and 12 mRNAs were significantly different. The aim was to sift the piRNA pathway genes that are correlated with ccRCC patient survival and to construct a piRNA pathway gene risk prognostic model using Kaplan-Meier survival curve and ROC curve, respectively. Results 5 piRNA pathway genes (TDRD7, GPAT2, PLD6, SUV39H1, and DOM3Z) were picked out and used to construct the piRNA pathway gene risk model. Kaplan-Meier survival curve analysis showed that compared with that of the low-risk group of ccRCC patients, the OS of the high-risk group of ccRCC patients was significantly reduced. The predictive performance of the prognostic risk model was measured using a ROC curve, which individually showed AUC values for 1 year of 0.707, for 3 years of 0.713, and for 5 years of 0.701. Moreover, the mRNA and protein expression levels of TDRD7 were overexpressed in the ccRCC datasets (data from our cohort, TCGA, GEO, and CPTAC) and ccRCC cell lines, and the expression levels correlated with the clinicopathological characteristics in ccRCC. The Tumor Immune Estimation Resource (TIMER) showed that the mRNA expression level of TDRD7 was positively related to tumor immune infiltrating cells (TICs) in ccRCC. Mechanistically, gene set enrichment analysis (GSEA) was performed to uncover the mechanism of TDRD7 in ccRCC. In summary, the piRNA pathway genes,especially TDRD7, may be potential cancer diagnostic and prognostic biomarkers of ccRCC.


Introduction
Renal cell carcinoma (RCC) is one of the most common tumors of the urinary system. In 2020, approximately 73,750 new cases and 14,8308 deaths occurred in the USA [1]. RCC is composed of various histological and molecular subtypes, of which clear cell renal cell carcinoma (ccRCC) is the most common, accounting for approximately 70-80% [2]. At present, for the diagnosis of RCC, the most frequently used technique is computed tomography (CT). Due to its resistance to radiotherapy and conventional chemotherapy, surgery is the most effective treatment for localized RCC [3]. Unfortunately, approximately 30% of patients have distant metastases at the time of diagnosis [4]. These patients lose their best treatment and have a worse prognosis. Therefore, there is a need for an urgent solution to find new diagnostic and therapeutic targets to reduce the mortality rate of ccRCC.
PIWI-interacting RNAs (piRNAs) are small noncoding RNAs (ncRNAs) of 24-32 nucleotides in length that specifically interact with PIWI proteins in the Argonaute protein family [5]. piRNAs exist in germ cells and somatic cells and have crucial functions for instance inhibition of transposable elements (TEs) and epigenetic regulation of gene expression. piRNA formation involves three pathways: primary processing, secondary processing (ping-pong cycle), and TE transcriptional silencing [6]. The primary processing of piRNA can occur in germline cells and their surrounding somatic cells, but the ping-pong cycle can only occur in germline cells [7,8]. piRNAs are processed in a conserved perinuclear structure called the nuage, which contains piRNA pathway proteins, including the Piwi branch of the Argonaute family of proteins, as well as some Tudor domain proteins, RNA helicases, and nucleases. Tudor domain containing 7 (TDRD7) is a Tudor domain protein that is mainly expressed in germline cells and colocalizes with other piRNA pathway components [9]. Its function in the piRNA pathway is to regulate inverted transposons in Drosophila germlines [10].
With the development of next-generation sequencing, piRNAs are found to play important roles in RCC. For example, Iliev et al. found that the higher the level of piR-823 in tumor tissue, the shorter the disease-free survival, and the higher the level of piR-823 in serum, the higher the clinical stage of RCC. piR-823 shows potential for early diagnosis of tumors [11]. Zhao et al. measured the expression of two kinds of mitochondrial piRNAs (piR-34536 and piR-51810) in the tissues and serum of ccRCC patients [12]. The results showed that the expression levels of piR-34536 and piR-51810 in ccRCC tissues were significantly lower than those in normal tissues, and the level of mitochondrial piRNA was negatively correlated with the prognosis of ccRCC patients. However, the function of piRNA pathway genes in ccRCC remains unclear.
In this study, we explored the diagnostic and prognostic value of piRNA pathway genes in ccRCC tumors via constructing a piRNA pathway gene risk prognostic model (using data derived from TCGA). We found that the piRNA pathway gene risk model could be an independent prognostic factor in ccRCC. Then, we found that the piRNA pathway gene TDRD7 was overexpressed in ccRCC. High TDRD7 expression was relevant to tumor progression and immune infiltrating cells in patients with ccRCC. Our data revealed that TDRD7 may provide a reliable biomarker for the diagnosis and prognosis of ccRCC.

Materials and Methods
2.1. The Cancer Genome Atlas (TCGA). TCGA (https:// genomecancer.ucsc.edu/), which is a large, free tumor data portal of the human genome project, contains the RNA   Type  TDRD6   ASZ1   TDRD1   PLD6  MOV10L1   FKBP6  NHLH1  TDRD12  GPAM  GPAT2  PIWIL1  PIWIL2   PIWIL4   PIWIL3   DDX4  TDRD9  TDRKH  TDRD5  TDRD7  RNF17  MALE  GTSF1  CBX5         Disease Markers from TCGA). We calculated the risk score for each patient using the regression coefficients of individual piRNA pathway genes and the expression values of each selected piRNA pathway gene obtained from the multivariate Cox risk model.
Risk scores were calculated from a linear combination of the relative gene expression levels multiplied by the regression coefficients. The regression coefficients were obtained by multiple Cox analyses and represented the relative weights of genes.
2.5. TIMER Database Analysis. TIMER (https://cistrome .shinyapps.io/timer/) is an online dataset used to assess clinical associations, mutations, and the relationships between somatic copy number alterations (SCNAs) and the invasion of seven immune cells in different cancer types. The Spearman correlation test was used to analyse the correlation between TDRD7 expression and the level of immune cell infiltration.
2.6. Survival Analysis and Receiver Operator Characteristic (ROC) Curve Analysis of piRNA Pathway Genes. From the online database Kaplan-Meier Plotter (http://kmplot.com/ analysis/), overall survival (OS) data were downloaded, where the cut-off value was the median. To verify the diagnostic value, the area under the ROC curve (AUC) was calculated by GraphPad Prism 8 (survival data from the TCGA database) to evaluate the diagnostic value. All P values < 0.05 were considered statistically significant.
2.7. Gene Set Enrichment Analysis (GSEA). According to the median expression level of TDRD7, the samples in the ccRCC patient datasets were separated into high and low expression groups, and GSEA (http://www.gsea-msigdb .org/gsea/index.jsp) was used to examine whether genes that were enriched in both groups were involved in life processes in a meaningful way. FDR ðvalueÞ < 0:25 and P < 0:05 were considered statistically significant.
2.10. Statistical Analysis. All data were analysed using SPSS software (Version 26.0; IBM, Armonk, NY, USA). First, the normality and homogeneity of variance were tested. When normality and homogeneity of variance were met, comparisons between multiple groups of data were performed by ANOVA, and two-group comparisons were performed using Student's t-test. The results are expressed as the mean ± standard deviation (mean s ± SD), and P < 0:05 indicates that the difference is statistically significant.

Construction of a Prognostic Model of piRNA Pathway
Genes. To build a prognostic risk model, univariate regression analysis was performed to identify six piRNA pathway genes (Figure 2(a)). The LASSO Cox regression model was used to escape excessive model fitting through multivariate Cox regression analysis (Figures 2(b) and 2(c)). The risk scores are shown below:    Disease Markers

Evaluation of the piRNA Pathway Gene Prognostic Risk
Model. We first analysed 266 patients separated into highand low-risk groups (based on the median risk score) in the training group. Kaplan-Meier survival curve analysis showed that compared with that of the low-risk group of ccRCC patients, the OS of the high-risk group of ccRCC patients was significantly reduced (Figure 3(a)). The predictive performance of the prognostic risk model was measured using a ROC curve, which individually showed AUC values for 1 year of 0.707, for 3 years of 0.713, and for 5 years of 0.701 (Figure 3(b)). We then analysed their distribution by ranking the risk scores of patients for OS (Figure 3(c)). The dot plots show the OS status of a single patient with ccRCC ( Figure 3(d)). The heat maps show the expression levels of the risk genes in the training group (Figure 3(e)). These results demonstrate the moderate performance of the prognostic prediction model based on piRNA pathway gene characteristics. Then, we used the same methods in the test group and evaluated the predictive performance of the prognostic risk model. This result is similar to the above result ( Figure 4). These results prove the stable performance of the prognostic prediction model based on piRNA pathway genes.

The Prognostic
Model Based on the piRNA Pathway Was an Independent Risk Factor for OS in ccRCC Patients. The TCGA ccRCC dataset was used to verify the features we identified, and miscellaneous clinical parameters were used as independent prognostic factors for ccRCC by univariate and multivariate Cox regression analysis. Univariate analysis showed that the piRNA pathway gene risk score, stage, T classification, and metastasis classification were markedly correlated with OS (all P < 0:05). Moreover, multivariate analysis indicated that the piRNA pathway gene risk score and metastasis classification were significantly related to OS (all P < 0:05). Hence, the piRNA pathway genes could be an independent prognostic factor for patients with ccRCC ( Figure 5, Table S1).  (Figure 6(i)) could effectively differentiate ccRCC patients. However, the expression of DOM3Z was not significantly associated with OS ( Figure 6(j)). Overall, the results indicated that the expression of the five piRNA pathway genes was markedly associated with the prognosis and diagnosis of ccRCC patients and could be useful as a biomarker for the prognosis of ccRCC patients and as diagnostic targets for ccRCC. According to previous results, we believe that TDRD7 is a key gene of the piRNA pathway in ccRCC. Thus, we chose TDRD7 for further study.

Survival and ROC Curve
3.6. Expression Levels of TDRD7 in ccRCC Patients. Transcriptome sequencing data from the TCGA dataset were used to assess the expression level of TDRD7 in ccRCC. TCGA results revealed that the TDRD7 expression level was upregulated in ccRCC tissues (Figure 7(a)). We used microarray data from GEO (GSE66270, GSE53747, GSE66270, and GSE68417) to further confirm the expression of TDRD7 in ccRCC. The results showed that TDRD7 was also highly expressed in ccRCC tissues

10
Disease Markers (Figures 7(b)-7(e)). Moreover, similar results were observed in our ccRCC patient cohort transcriptomesequencing data (Figure 7(f)). In addition, compared with that of normal tissues, the protein expression level of TDRD7 in ccRCC tissues was significantly higher (data from the CPTAC) (Figure 7(g)). The TDRD7 immunohistochemistry (IHC) results were obtained from the Human Protein Atlas and were in line with the results obtained from the CPTAC (Figure 7(h)). Moreover, qRT-PCR detected that all of the ccRCC cell lines exhibited high TDRD7 expression levels in contrast to that of the normal kidney epithelial cell line (HK-2) (Figure 7(i)).

Clinical Features Related to TDRD7 Expression in ccRCC.
To further identify the underlying role of TDRD7 based on clinical data, we reviewed the clinical data of the TCGA ccRCC patients. Clinical pathological parameters included the individual cancer stages, TNM classification, grade, age, and sex of patients. The expression of TDRD7 was markedly upregulated in ccRCC classified as stages I-IV (Figure 8(a)), grades 1-4 ( Figure 8(b)), or T classification 1-4 compared with that of normal renal tissues (Figure 8(c)). Moreover, low expression of TDRD7 mRNA was markedly correlated with metastasis ( Figure 8(d)) and sex (Figure 8(e)). Neither the correlation between TDRD7 expression and age (Figure 8(f)) nor that between TDRD7 expression and N classification (Figure 8(g)) in patients with ccRCC was significant. The N classification result may be caused by the small sample size (only 16 patients were lymph node-positive). We further studied the correlation between the TDRD7 protein expression level of ccRCC patients and the abovementioned clinicopathological The TDRD7 mRNA expression level for the patient characteristics of (a) stage, (b) grade, (c) T classification, (d) metastasis classification, (e) gender, (f) age, and (g) lymph node classification. The TDRD7 protein expression level for the patient characteristics of (h) pathologic stage and (i) grade. The data are presented as the means ± SD. * P < 0:05; * * * P < 0:001; NS: no significance. 11 Disease Markers characteristics. The results showed that the increase in TDRD7 protein expression was related to tumor stage and tumor grade (Figures 8(h) and 8(i)).

Discussion
Currently, with the improvement in ccRCC diagnostic technology, the diagnostic rate of ccRCC has been greatly advanced. However, a large number of patients still fail to obtain the best treatment due to a lack of early clinical symptoms and sensitive biomarkers. According to reports, a host of piRNAs play an important role in ccRCC [11,14,15]. piRNA pathway genes are essential for the formation of piR-NAs. However, few studies have focused on the mechanism of piRNA pathway genes and their roles in ccRCC. Therefore, we explored which piRNA pathway genes can be used as novel effective prognostic biomarkers for patients with ccRCC. In this study, the expression of twenty-seven piRNA pathway genes was analysed in ccRCC using the TCGA database, and then, five of the genes were used to build a prognostic risk model. We then selected TDRD7 for further verification. TDRD7 is a member of the Tudor domain RNA binding (TDRD) protein family. The TDRD protein family, named because it contains one or more Tudor domains, is an evolutionarily conserved family of methylated arginine binding proteins [16,17]. The TDRD protein family is involved in the formation of the nuage, the piRNA pathway in the gametes, and the occurrence of cancer [18][19][20][21][22][23]. TDRD7 contains three helix-turn-helix (HTH) 13 Disease Markers domains and two Tudor domains, is widely expressed in the germline and testis, and is associated with reverse transposon inhibition of long interspersed nuclear elements-I (LINE-I) [10]. In our study, by differential gene expression analysis, we found that TDRD7 is highly expressed in ccRCC samples compared with normal kidney specimens. However, when compared the expression of TDRD7 in ccRCC samples with different grades and stages, we noticed that TDRD7 is upregulated in lower malignancy or later stages of ccRCC. Moreover, we found that ccRCC patients with lower TDRD7 expression had a poorer prognosis. These controversial findings obfuscate the role of CHAC1 in the initiation or progression of KIRC [24]. Relevant literature has confirmed that tumor immune infiltration cells are associated with the prognosis of ccRCC patients [25,26]. The TIMER database showed that the expression level of TDRD7 in ccRCC was positively correlated with the expression levels of tumor immune infiltration cells. This result indicated that TDRD7 may be involved in the immune response of the ccRCC tumor microenvironment. Survival analysis also found that ccRCC patients with low expression of CD8 + T cells and macrophages had a worse prognosis. Above, we mentioned that TDRD7 is upregulated in ccRCC tissues, but its high expression is related to a good prognosis in ccRCC. We hypothesized that the overexpression of TDRD7 promotes the infiltration of CD8 + T cells and macrophages in ccRCC and ultimately slows tumor progression.
To further confirm the potential mechanism of TDRD7 in ccRCC, GSEA was carried out and showed that the high expression of TDRD7 was related to the PI3K/Akt/mTOR signalling, mitotic spindle signalling, and TGF-β signalling pathways. Previous studies have demonstrated that TGF-β induced epithelial to mesenchymal transition (EMT) in hepatocellular carcinoma (HCC) through the PI3K/Akt/ mTOR signalling pathways [27]. Hence, these results have shown that TDRD7 may regulate the oncogenesis of ccRCC via the TGF-β/PI3K/Akt/mTOR signalling pathways. Of course, this study also had several limitations. We only predicted that TDRD7 could be used as a predictor and diagnostic marker through the database, while functional experiments of TDRD7 were lacking, so there was no further verification of our hypothesis in ccRCC.

Conclusions
In summary, we first found that piRNA pathway genes have different expression levels in ccRCC. Then, we determined that the piRNA pathway gene risk model could be an independent prognostic factor in ccRCC. Furthermore, we chose the piRNA pathway gene TDRD7 for further study. Our results indicated that decreased expression of TDRD7 may be useful in predicting the poor prognosis of patients with ccRCC and may inhibit tumor immune cell infiltration in ccRCC. Moreover, this study revealed that TDRD7 could become a potential diagnostic and prognostic target for ccRCC.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.