TDG Gene Polymorphisms and Their Possible Association with Colorectal Cancer: A Case Control Study

Genetic alterations that can lead to colorectal cancer (CRC) affect essential genes, including those involved in DNA repair pathways such as base excision repair (BER). Thymine DNA glycosylase (TDG) is one of the best-characterized BER enzymes; it catalyzes the removal of thymine from G/T mismatches and is also involved in other cellular functions, such as the regulation of gene expression, transcriptional coactivation, and the control of epigenetic DNA modification. Mutation of the TDG gene is implicated in carcinogenesis. In the present study, we investigated the association between TDG gene polymorphisms and colon cancer susceptibility. Blood samples were obtained from 100 colorectal cancer patients and 192 healthy controls for the genotyping of six SNPs in the TDG gene. DNA was extracted from the blood, and the polymorphic sites (SNPs) rs4135113, rs4135050, rs4135066, rs3751209, rs1866074, and rs1882018 were investigated using TaqMan genotyping. One of the six TDG SNPs was associated with an increased risk of colon cancer. The AA genotype of the TDG SNP rs4135113 increased the risk of colon cancer development by more than 3.6-fold, whereas the minor allele A increased the risk by 1.6-fold. The AA genotype also showed a 5-fold higher risk in patients over the age of 57. SNP rs1866074 showed a significant protective association in CRC patients. The GA genotype of TDG rs3751209 was associated with a decreased risk in males. There is a significant relationship between TDG gene function and colorectal cancer progression.


Introduction
The development of cancer is a multistep process involving aberrations in many cellular processes, including differentiation, cell cycle regulation, cell death, proliferation, and genome maintenance, caused by functional alterations in a variety of genes. Thymine DNA glycosylase (TDG) is a member of the mismatch uracil glycosylase subfamily. All of these uracil DNA glycosylase (UDG) enzymes have a monofunctional mode of action [1]. UDGs use a common base-flipping, DNA intercalation mechanism for substrate recognition and catalyze the cleavage of the N-glycosidic bond of the flipped base, thus creating an abasic (AP) site [2]. TDG has a crucial role in DNA repair, particularly BER, in which it specifically recognizes G:U and G:T mismatches resulting from the spontaneous deamination of cytosine and 5-methylcytosine. In addition to its DNA repair function, TDG is also involved in other critical cellular processes, such as the regulation of gene expression, transcriptional coactivation, and the regulation of epigenetic DNA modification [3]. TDG has been shown to interact with several transcription factors, particularly nuclear receptors. TDG initiates the BER pathway, which uses the base-flipping mechanism to excise the target base from the DNA, forming an AP site. This happens when TDG binds to the promoters of the BER proteins APE, DNA ligase, and Pol [4]. The role of TDG in cancer progression remains debated [5]. Its interaction with tumor suppressor P53 (TP53) proteins initially suggested that TDG acts simply as a tumor suppressor. Overexpression of TDG recruits TP53 proteins to the cyclin-dependent kinase inhibitor 1A (p21Waf1) gene promoter and increases its transcriptional activity [6]. Moreover, TP53 binding to the TDG promoter transcriptionally regulates its expression and controls the nuclear translocation of TDG [7].
The relationship between TDG and cancer has been studied by a number of research groups, who have suggested that genetic variants in TDG and other DNA repair genes confer susceptibility to colorectal cancer (CRC) [8]. Xu and colleagues showed that TDG positively regulates the Wnt signaling pathway and is a key driver necessary for the progression of CRC [9]. They also reported that hypermethylation of TDG in multiple myeloma cell lines reduced its gene expression. As a result, DNA repair activity became less efficient [10] in pancreatic adenocarcinoma [11]. Finally, a lack of the DNA mismatch repair protein PMS2 together with reduced TDG expression in rectal cancer has been found to produce a supermutator phenotype at CpG sites [12].
Recent studies reported that the SNP rs2888805 (Val367Met) in TDG might be implicated in nonmelanoma skin cancer [13]. The TDG SNPs rs167715 and rs4135087 might also be associated with the progression of ovarian cancer in most of the BRCA1/2 mutation carriers [14]. The coding region SNP rs369649741 (Arg66Gly) has been reported to be associated with a high risk in familial colorectal cancer patients [8]. Significant associations have been demonstrated between the rs4135054 SNP in TDG and the risk of cancers, including esophageal squamous cell carcinoma and gastric cancer [15]. This study was conducted to determine the association between SNPs in the DNA repair gene TDG and colon cancer risk in the Saudi population.

20), rs1866074 (C 3152280 10), and rs1882018 (C 11490839 10). The preliminary data on the SNPs are shown in Table 1. These SNPs were also selected based on literature reviews of SNP associations with various diseases in diverse ethnic groups. The genotyping analysis was conducted using QuantStudio6 7 Flex Real-Time PCR System (Applied Biosystems) with an endpoint reading of the genotypes [16].

Results
A total of 100 colorectal cancer patients and 192 normal controls from a Saudi Arabian population were included in the present study. The clinical and demographic features of the study subjects are described in Supplementary Table 1 (Suppl. Table 1). Both CRC and normal samples were classified based on demographic parameters such as age and gender. Colorectal cancer samples were further classified based on tumor location, namely, colon or rectum. The average age of the CRC patients was 57.10 ± 12.17 years and that of the controls was 58.2 ± 8.34 years.
All six SNPs in the normal control and CRC patient groups obeyed Hardy-Weinberg equilibrium (HWE) (Table 1). Table 1 lists the details of the SNPs used in the present study, including the minor allele frequency and the HWE p-value. Of the six SNPs, two, rs4135113 and rs1866074, showed a significant association with colorectal cancer. The genotypic distribution of rs4135113 was 75% GG, 18% GA, and 7% AA in colorectal cancer patients and 82% GG, 16% GA, and 2% AA in normal samples. The AA genotype of SNP rs4135113 (Gly199Ser) showed a significant risk association with colorectal cancer in Saudi patients (OR: 3.640, CI: 1.034-12.819, p = 0.03286) (Table 2). The frequency of the minor allele A was also significantly higher in patient samples than in the healthy controls (OR: 1.675, CI: 1.013-2.769, p = 0.04264) (Table 2).
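The Hardy-Weinberg check mentioned above can be illustrated with a short sketch. This is not the authors' analysis code (the text does not name the software used); it applies the standard chi-square goodness-of-fit test with one degree of freedom to genotype counts. The control counts for rs4135113 in the usage lines (157 GG, 31 GA, 4 AA) are an assumption, back-calculated from the rounded percentages and n = 192, so the output only approximates what Table 1 would report.

```python
import math


def hwe_chi_square(n_hom_major, n_het, n_hom_minor):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Returns (minor_allele_frequency, chi2, p_value) with 1 degree of freedom.
    """
    n = n_hom_major + n_het + n_hom_minor
    if n == 0:
        raise ValueError("at least one genotyped sample is required")

    # Allele frequencies from genotype counts (each subject carries 2 alleles).
    p = (2 * n_hom_major + n_het) / (2 * n)
    q = 1.0 - p

    observed = (n_hom_major, n_het, n_hom_minor)
    expected = (n * p * p, 2 * n * p * q, n * q * q)

    # Skip classes with zero expectation (monomorphic site) to avoid 0/0.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return q, chi2, p_value


# Hypothetical control counts for rs4135113 (GG, GA, AA); see the note above.
maf, chi2, p_value = hwe_chi_square(157, 31, 4)
print(f"MAF={maf:.3f} chi2={chi2:.3f} p={p_value:.3f}")
```

A p-value above 0.05 from this test is conventionally read as no detectable departure from HWE, which is the sense in which the SNPs are said to obey HWE.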
The genotypic distribution of rs1866074 was 22% AA, 39% AG, and 39% GG in colorectal cancer patients and 12% AA, 43% AG, and 45% GG in the normal samples. The GG genotype frequency was lower in colorectal cancer patients than in the controls. SNP rs1866074 showed a protective association for the GG genotype (OR: 0.501, CI: 0.251-1, p = 0.047) and for the additive (AG+GG) genotype (OR: 0.51, CI: 0.269-0.964, p = 0.036) (Table 2). The remaining SNPs, rs4135050, rs4135066, rs3751209, and rs1882018, did not show any association with colorectal cancer in the overall analysis (Table 2).
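As a hedged illustration of how odds ratios and 95% confidence intervals of the kind reported above are commonly obtained, the sketch below computes an odds ratio from a 2x2 table with a Woolf (log-normal) interval. The text does not state which software or interval method was used, so the method is an assumption. The counts are also assumptions, back-calculated from the rounded genotype percentages of rs4135113 (cases: 75 GG, 18 GA, 7 AA; controls: approximately 157 GG, 31 GA, 4 AA). The function names are illustrative.

```python
import math


def odds_ratio_ci(case_exposed, case_ref, ctrl_exposed, ctrl_ref, z=1.96):
    """Odds ratio with a Woolf (log-normal) confidence interval.

    z = 1.96 gives an approximate 95% interval. All four cells must be > 0.
    """
    cells = (case_exposed, case_ref, ctrl_exposed, ctrl_ref)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive")

    odds_ratio = (case_exposed * ctrl_ref) / (case_ref * ctrl_exposed)
    se_log_or = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def allele_counts(n_hom_ref, n_het, n_hom_alt):
    """Return (ref_allele_count, alt_allele_count) from genotype counts."""
    return 2 * n_hom_ref + n_het, 2 * n_hom_alt + n_het


# Hypothetical counts (GG, GA, AA) reconstructed from rounded percentages.
cases = (75, 18, 7)
controls = (157, 31, 4)

# Genotype contrast: AA versus the GG reference genotype.
print(odds_ratio_ci(cases[2], cases[0], controls[2], controls[0]))

# Allele contrast: minor allele A versus major allele G.
case_g, case_a = allele_counts(*cases)
ctrl_g, ctrl_a = allele_counts(*controls)
print(odds_ratio_ci(case_a, case_g, ctrl_a, ctrl_g))
```

With these reconstructed counts the genotype contrast comes out at about 3.66 and the allele contrast at about 1.68, close to but not identical with the reported 3.640 and 1.675, as expected given the rounding of the percentages.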

Stratification Analysis.
After an overall analysis, we compared the TDG genotype frequencies based on gender. The genotype distributions of male (n = 58) and female (n = 42) patients were compared with those of matched healthy individuals (Tables 3 and 4). Only rs3751209 showed a protective association in female colon cancer patients with    (Table 3). No other SNPs showed any significant association with colorectal cancer based on gender (Tables 3 and 4). The frequency of the A allele in patient samples also showed a significant difference compared with that of the healthy individuals (OR: 2.238, CI: 1.059-4.729, p = 0.03159).
The TDG genotype distribution was further correlated with the age at colon cancer diagnosis and tumor location. To assess the association of the analyzed SNPs with age at colon cancer diagnosis, we divided the patients into two groups based on the median age of the samples: ≤57 (n = 53) or >57 (n = 47) years of age. The distributions of genotype and allele frequencies for each SNP are shown in Tables 5 and 6. SNP rs4135113, which showed a significant association with CRC in the overall analysis, also showed a significant risk association in CRC patients above 57 years of age. The AA genotype frequency was higher in patients than in healthy individuals. This genotype showed a more than 5-fold increased risk of colon cancer in the Saudi Arabian population (OR: 5.588; CI: 1.032-30.254; p = 0.02745). In addition, the rs4135113 minor allele A showed a 2-fold increased risk of colorectal cancer in the Saudi population (OR: 2.184, CI: 1.077-4.431; p = 0.02778) (Table 6). A linkage disequilibrium analysis revealed that the strength of the associations among the SNPs differed between cases and controls (Figure 1).
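For reference, the standard pairwise measures of linkage disequilibrium between two biallelic SNPs are given below. The text does not state which measure is plotted in Figure 1, so this is only a reminder of the usual definitions. The notation is introduced here and is not from the source: p_A and p_B are the frequencies of one chosen allele at each SNP, and p_AB is the frequency of the haplotype carrying both.

```latex
D = p_{AB} - p_A\,p_B, \qquad
D' = \frac{D}{D_{\max}}, \qquad
r^2 = \frac{D^2}{p_A (1 - p_A)\, p_B (1 - p_B)},
\qquad
D_{\max} =
\begin{cases}
\min\bigl(p_A (1 - p_B),\ (1 - p_A)\, p_B\bigr), & D > 0,\\[4pt]
\min\bigl(p_A\, p_B,\ (1 - p_A)(1 - p_B)\bigr), & D < 0.
\end{cases}
```

Both |D'| and r^2 range from 0 (no association between the two sites) to 1, so a difference in LD strength between cases and controls corresponds to different values of these quantities for the same SNP pair.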

Discussion
To the best of our knowledge, very few studies correlating variation in the TDG gene with cancer have been reported [16][17][18]. To study the role of TDG gene polymorphisms in CRC risk, we investigated six SNPs (rs4135113, rs4135050, rs4135066, rs3751209, rs1866074, and rs1882018) distributed in different regions of the TDG gene. The SNPs were selected based on their location in the TDG gene: rs4135113 is located in exon 5; rs4135050, rs4135066, and rs1882018 are in intron 1; and rs3751209 and rs1866074 are in intron 2 and intron 3, respectively. We chose these SNPs to study the effect of mutations in exons and introns. Mutations in an exon might affect the synthesized protein, whereas intron mutations might affect the RNA processing machinery and RNA splicing and stability, which could impact the level of expression and/or protein output [17]. Five of the SNPs are located in intronic regions, and four of them are in regulatory regions. SNPs rs4135066, rs4135050, and rs1882018 are located in aligned intronic regions flanking an alternative conserved exon (ACE) region. SNPs rs4135050 and rs1882018 are in an exonic splicing silencer (ESS) region, and rs1866074 is in an exonic splicing enhancer region. All six SNPs in the normal control and CRC patient groups obeyed the Hardy-Weinberg equilibrium (HWE). Of the six SNPs, two showed a significant association with CRC. The AA genotype of SNP rs4135113 showed a significant risk association (OR: 3.640, CI: 1.034-12.819, p = 0.03286).

Journal of Oncology

The SNP located in the coding region of the TDG gene, rs4135113, a G/A transition (missense mutation, Gly199Ser), was studied to detect whether there was any association with CRC. There is recent evidence supporting an association between this polymorphism and the development of cancer. Sjolund et al. [15] reported that the Gly199Ser polymorphism occurs in approximately 10% of the global population and that the expression of TDG with the G199S variant in human breast epithelial cells might lead to an increased number of DNA double-strand breaks. Thus, it initiates and activates DNA damage that induces cellular transformation and chromosomal aberrations [18]. Our results showed that the A/A genotype increases the risk of CRC by approximately fourfold in Saudi patients and that this increase is statistically significant (OR = 3.64, p-value = 0.03) (Table 2). Further investigation was conducted to explore the correlation of this polymorphic site with the clinicopathological factors, and we observed that rs4135113 showed a fivefold increased risk in patients older than 57 years. A study carried out by Wen-Bin and colleagues (2009) on a Chinese population showed a significant association of [19][20][21]. We also investigated the effect of rs4135050, in which T is substituted by A, on the risk of CRC. The AA genotype in our study showed an elevated CRC risk, although the difference was not statistically significant (Table 2). In an urban Puerto Rican population, the one-carbon nutrient status was not associated with the DNA uracil concentration in this SNP [22].
In the SNP rs4135066, C is substituted by T. In this investigation, the homozygous TT genotype showed an increased risk of CRC; however, this was not statistically significant (Table 2). A recent study by Barry et al. in an American population showed that the SNP rs4135066 was not statistically associated with prostate cancer [23]. In the rs3751209 polymorphism, the A/G variation in our study showed a reduction in the CRC risk, but the difference did not reach statistical significance (Table 2). A recent study by Osorio et al. showed that this SNP was not associated with breast cancer risk in BRCA1/2 mutation carriers [14]. Another SNP studied was rs1866074, which is located in an intronic region and results from a transition mutation in which A is substituted by G. A recent case-control study showed that the increase in the frequency of micronuclei in bladder cancer among the AG and GG carriers improved patient prognosis [24]. In this investigation, we observed that the GG genotype and the AG+GG additive genotype decreased the risk of CRC (Table 2). Finally, we studied rs1882018, which is also located in an intronic region and results from a transition mutation in which A is substituted by G. Our results showed that the GG genotype increased the risk of CRC, but the finding did not reach statistical significance (

Conclusions
In conclusion, the present study showed a significant association between the TDG gene and colorectal cancer progression in a Saudi population. One of the six TDG SNPs showed an increased risk of colon cancer. TDG rs4135113 increased the risk of colon cancer development by more than 3.6-fold (AA genotype) and 1.6-fold (minor allele A) in CRC patients in general, and by more than 5-fold in patients older than 57 years. SNP rs1866074 showed a significant protective association in CRC patients. The GA genotype of TDG rs3751209 showed a decreased risk of CRC in males. Thus, there is a significant relationship between TDG gene function and colorectal cancer progression. However, further studies using in vitro methods are required to determine the exact effect of the amino acid replacement (Gly199Ser).

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
All authors declare no conflicts of interest.