Analysis of Candidate Genes in Occurrence and Growth of Colorectal Adenomas

Predisposition to sporadic colorectal tumours is influenced by genes with minor phenotypic effects. A case-control study was set up on 295 patients treated for a large adenoma, matched in a 1 : 2 proportion with polyp-free individuals on gender, age, and geographic origin. A second group of 302 patients treated for a small adenoma was also characterized to distinguish effects on adenoma occurrence from effects on adenoma growth. We focussed the study on 38 single nucleotide polymorphisms (SNPs) encompassing 14 genes involved in colorectal carcinogenesis. The effect of each SNP was tested using unconditional logistic regression. Comparisons were made for haplotypes within a given gene and for biologically relevant gene combinations using the combination test. The APC p.Glu1317Gln variant appeared to influence adenoma growth (P = .04, exact test) but not adenoma occurrence. This result needs to be replicated, and genome-wide association studies may be necessary to fully identify low-penetrance alleles involved in the early stages of colorectal tumorigenesis.


Introduction
Colorectal cancer (CRC) is one of the most common human malignancies in Western countries. The majority of cases develop from a premalignant lesion, the adenomatous polyp [1]. Colorectal adenomas have a high malignant potential when they are large in diameter and/or present with severe dysplasia and/or a villous component [2]. Colonoscopic polypectomy has been documented to significantly reduce the incidence of colorectal cancer [3,4]. Therefore, the identification of factors associated with the development of colorectal adenoma represents a major goal in colorectal cancer prevention. Such factors could allow the selection of individuals at risk of CRC who may benefit from screening by colonoscopy.
The adenoma-carcinoma sequence suggests that colorectal adenomas and adenocarcinomas share common environmental and genetic risk factors. An increased risk of colorectal tumors has been found in relatives of patients with large adenomas [5,6]. A case-control study suggested that family history of colorectal cancer influenced only the growth of adenomas or their malignant transformation [7]. However, relatively few epidemiologic studies have explored genetic risk factors in colorectal adenomas. We investigated, through a case-control study, the relation between polymorphisms within a series of candidate genes involved in colorectal tumorigenesis, and putatively in the formation and development of colorectal adenomas, such as carcinogen metabolism enzymes, methylation enzymes, DNA repair genes, oncogenes, and tumor suppressor genes [8].

Constitution of the Patients and Control Groups.
The GEnetics of ADEnomas (GEADE) study is a case-control and family study of patients with high-risk adenomas (≥10 mm) [9]. The data were obtained from 18 participating gastroenterology units of general hospitals in France. From September 1995 to March 2000, 306 consecutive patients with a newly diagnosed large colorectal adenoma (LA) were enrolled in the study. Patients with a personal history of cancer, familial adenomatous polyposis, established hereditary nonpolyposis colorectal cancer, or inflammatory bowel disease were excluded. To distinguish genetic factors involved in the occurrence of adenomas from those involved in their growth, 307 cases with small adenomas (SA; diameter smaller than 0.5 cm) and 572 polyp-free controls (PF; normal colonoscopy) were simultaneously enrolled in the same units. All patients and controls were of Caucasian origin. Reason for referral, family history of CRC, and completeness of colonoscopy were recorded for all patients and controls. Two PF controls per LA case were selected from among more than 2000 PF individuals, matched on age, gender, and geographic area. Patients with SA were relatively rare and could not be matched with LA cases.
Blood specimens were obtained at the time of colonoscopy, and patients who presented with a polyp were included only when histological examination revealed the adenomatous nature of the lesion. As polyps were totally removed during colonoscopy, their natural evolution could not be scored. After longitudinal sectioning, half of the tumor material was fixed for histologic analysis and half was frozen for molecular characterization. Twenty individuals had to be excluded because of insufficient tumor material: 11 patients with LA, 5 with SA, and 4 PF. The final groups contained 295 patients with LA, 302 with SA, and 568 PF as controls. Details of these groups have been reported by Lièvre et al. [10]. All patients and controls signed an informed consent form after approval of the study by an ethics committee for biomedical research (Le Kremlin-Bicêtre), and the database was declared to the national committee Commission Nationale de l'Informatique et des Libertés (CNIL).

Genes Studied and Genotyping Procedure.
Genes were selected for their role in colorectal tumorigenesis and for the presence of frequent neutral polymorphisms. Polymorphisms in the promoters of the MMP1, MMP3, and MMP7 genes, whose products are responsible for the degradation of extracellular matrix components, had been previously studied and were not considered in the present analysis [10]. Three genes, TS, UGT1A1, and MDR1, are implicated in folate metabolism, bilirubin metabolism, and transport of xenobiotics, respectively. All these pathways are suspected to play a role in cancer occurrence or in the transformation of benign lesions: folates by interfering with DNA synthesis and repair, bilirubin through its known antioxidant properties, and xenobiotic transport through its detoxification function. Case-control reports [8] exemplified the relevance of these three genes in addition to HRAS1. Three additional pathways were studied secondarily, namely the WNT pathway (APC, CTNNB1, AXIN2), the p53 pathway (TP53, BAX), and the TGFB pathway (TGFBR2), together with the main MMR and BER genes (MSH2, MLH1, MYH) and CARD15 as a member of the family of genes with LRR domains. All these genes have well-established major roles in oncogenesis, cell cycle control, apoptosis regulation, cell adhesion and migration, proliferation, and DNA repair.
Genotyping was performed according to the nature of the polymorphisms. Single nucleotide polymorphisms (SNPs) were characterized by TaqMan analysis. The VNTR of HRAS1 was studied by conventional electrophoresis after PCR amplification on 1.2% agarose gels at 2 volts/cm for 15-18 hours.

Statistical Analysis.
Hardy-Weinberg proportions were tested for each polymorphism. Pairwise linkage disequilibrium (LD) between loci was estimated using the D′ [11] and r² [12] measures. Association was first tested for each polymorphism separately. Genotype-specific odds ratios (ORs) and 95% confidence intervals (CIs) were computed using unconditional logistic regression adjusted on matching factors; the Wald test was used to assess the global effect of each polymorphism. The homozygous genotype for the more frequent allele among controls was set as the reference class. Homogeneity of allele frequencies across geographic regions was checked beforehand. Tests of homogeneity and unconditional logistic regression were done using the SAS software package. The association was further examined using the combination test, which allows the analysis of all possible combinations of SNPs within a given gene or tightly linked genes to test their association with the disease [13]. For each SNP combination, the method computes a test statistic contrasting the genotypic (or haplotypic) distribution between cases and controls. Because all these tests are correlated (many of them are nested in each other and the SNPs are likely to be in LD), a permutation procedure has been implemented which yields a significance level adequately adjusted for multiple testing. First, we used the FAMHAP12 software to apply this method by performing haplotypic tests [14]. Then COMBINTEST (Jannot, personal communication) was used to perform genotypic tests. The method was extended to the test of combined polymorphisms in different genes that may act interactively. To avoid multiple-testing bias, we chose a three-step strategy that minimizes the number of comparisons. First, we considered pairwise combinations of polymorphisms chosen for their plausibility of interacting with each other, for instance, APC and MYH, both responsible, when mutated, for adenomatous polyposis.
The second step consisted of considering all combinations within the same pathway. Finally, as a combination of polymorphisms may have an effect even if the genes are not suspected to interact with each other and are not involved in the same pathway, all possible combinations were considered if the first two steps did not reveal any significant association. When several combinations of polymorphisms were significant within a given test, it was possible to test nested combinations using a chi-square calculation to determine whether a given polymorphism adds a significant contribution to the association found [14].
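The permutation idea behind such a multiplicity-adjusted significance level can be sketched as follows. This is a generic min-P permutation scheme written for illustration, not the algorithm implemented in FAMHAP or COMBINTEST; the function names `chi2_stat` and `min_p_permutation`, the 0/1/2 genotype coding, and the choice of a Pearson chi-square statistic are all assumptions.

```python
import random
from itertools import combinations

def chi2_stat(groups, labels):
    """Pearson chi-square contrasting the distribution of `groups`
    between controls (label 0) and cases (label 1)."""
    counts = {}
    for g, y in zip(groups, labels):
        counts.setdefault(g, [0, 0])[y] += 1
    n = len(labels)
    n_case = sum(labels)
    n_ctrl = n - n_case
    stat = 0.0
    for ctrl, case in counts.values():
        total = ctrl + case
        for obs, col_total in ((ctrl, n_ctrl), (case, n_case)):
            expected = total * col_total / n
            if expected > 0:
                stat += (obs - expected) ** 2 / expected
    return stat

def min_p_permutation(genotypes, labels, n_perm=200, seed=0):
    """Return (adjusted P-value, best SNP combination).

    genotypes: one tuple of genotype codes per individual (hypothetical coding).
    Every non-empty subset of SNPs is tested; the smallest per-combination
    P-value is then compared with its own permutation distribution.
    """
    rng = random.Random(seed)
    n_snps = len(genotypes[0])
    combos = [c for k in range(1, n_snps + 1)
              for c in combinations(range(n_snps), k)]

    def stats_for(lab):
        return [chi2_stat([tuple(g[i] for i in c) for g in genotypes], lab)
                for c in combos]

    # Row 0 holds the observed statistics, rows 1..n_perm the permuted ones.
    table = [stats_for(labels)]
    perm = list(labels)
    for _ in range(n_perm):
        rng.shuffle(perm)
        table.append(stats_for(perm))

    # Empirical P-value of every row within each combination (column).
    pcols = []
    for j in range(len(combos)):
        col = [row[j] for row in table]
        pcols.append([sum(s >= x for s in col) / len(col) for x in col])

    # Smallest P-value per row, then its rank among all rows.
    min_p = [min(pc[b] for pc in pcols) for b in range(len(table))]
    adjusted = sum(m <= min_p[0] for m in min_p) / len(min_p)
    best = min(range(len(combos)), key=lambda j: pcols[j][0])
    return adjusted, combos[best]
```

Because case and control labels are shuffled as a whole, the correlation between nested or linked combinations is preserved in every permuted data set, which is why a single adjusted P-value can cover all combinations at once.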

Results
The average age was very similar in patients (62 years for LA and 61 years for SA) and in PF controls (61 years). The sex ratio (male/female) was quite similar in LA (1.7) and in PF controls (1.5), slightly lower in SA (1.2) because of the absence of matching. Table 1 describes the characteristics of the polymorphisms studied and the allele frequencies in controls. The distribution of genotypes in controls was consistent with Hardy-Weinberg proportions for all polymorphisms except for TP53.3 (P = .0013 without correction for multiple testing). A similar departure from H-W proportions (P = .010) was found in the HapMap European population (http://www.ncbi.nlm.nih.gov/projects/SNP/) for this polymorphism.
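For readers who want to reproduce this kind of check, a minimal sketch of a Hardy-Weinberg test is given below. The text does not say which test was used, so the one-degree-of-freedom Pearson chi-square shown here is an assumption, as are the function name `hwe_chi2` and the genotype counts in the usage note.

```python
import math

def hwe_chi2(n_aa, n_ab, n_bb):
    """Pearson chi-square test (1 df) of Hardy-Weinberg proportions.

    Takes the three genotype counts of a biallelic marker and returns
    (chi-square, P-value). Both alleles must be present in the sample.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

With hypothetical counts, `hwe_chi2(25, 50, 25)` gives a chi-square of 0 (perfect agreement), whereas `hwe_chi2(40, 20, 40)` shows a strong heterozygote deficit.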
Polymorphisms within the same gene were in moderate or no LD, except in MYH: MYH.1 and MYH.2 are in near-complete disequilibrium (D′ = 1 and r² = 0.99). As these two polymorphisms appeared to be equivalent, we analysed only MYH.1.
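The two LD measures can be computed from haplotype and allele frequencies as in the sketch below, which uses the standard definitions of the normalized disequilibrium coefficient and of the squared correlation. The function name `ld_measures` and the frequencies in the usage note are illustrative assumptions, not values from the study.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B (strictly between 0 and 1)
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2
```

With hypothetical frequencies, `ld_measures(0.2, 0.2, 0.4)` yields a normalized coefficient of 1 but a squared correlation of only 0.375, because the two alleles differ in frequency. A squared correlation close to 1 additionally requires near-identical allele frequencies, which is what makes two markers interchangeable in an association analysis.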
Analysis of single polymorphisms for the different comparisons (LA versus PF, LA versus SA, and SA versus PF) did not reveal any association for most of the genes studied. Table 2 indicates the P-values obtained for each polymorphism. P-values less than .05 were obtained for APC.1, MDR1.2, and AXIN2.9. The corresponding ORs and 95% confidence intervals are given in Table 3.
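As a reminder of how an odds ratio and its confidence interval are formed, the sketch below computes an unadjusted OR with a Wald interval from a 2 × 2 table. It is a simplification: the study used unconditional logistic regression adjusted on matching factors, and the function name `odds_ratio_wald` and the counts in the usage note are hypothetical.

```python
import math

def odds_ratio_wald(a, b, c, d, z=1.959964):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: carriers among cases       b: non-carriers among cases
    c: carriers among controls    d: non-carriers among controls
    z: normal quantile (default gives a 95% interval)
    All four cells must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper
```

For hypothetical counts `odds_ratio_wald(20, 80, 10, 90)`, the OR is 2.25 and the interval spans 1, so the association would not be significant at the 5% level. When one cell is very small, the Wald interval becomes unreliable and an exact method is preferable.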
The results of the haplotypic and genotypic combination tests performed for each gene or gene combination are shown in Table 4. When considering each gene separately, only APC displayed a significant difference between small and large adenomas (P = .014 in haplotypic analysis and P = .07 in genotypic analysis). As APC.1 alone and the combination APC.1-APC.2 both appeared significant, we tested whether APC.2 provided a significant contribution to the association, which was not the case (χ² = 2.12, 1 df), showing that the association was solely due to the APC.1 (p.Glu1317Gln) polymorphism. Logistic regression indicated an odds ratio of 7.95 (CI: 1.05-354.3). The combination APC-MYH also appeared significant, with a global P-value of .045. However, the only significant combination was APC.1-APC.2, already found, showing that MYH does not provide any additional contribution to the association between APC and adenomas. No difference was found for the other two comparisons, that is, SA versus PF and LA versus PF. Finally, the analysis of all possible combinations did not provide any significant result.
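The nested-test statistic quoted above can be converted into a P-value with the upper tail of a chi-square distribution with one degree of freedom. The sketch below does this with the standard library only; the function name `chi2_sf_1df` is an assumption.

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))

# Statistic reported for the contribution of APC.2 beyond APC.1
p_nested = chi2_sf_1df(2.12)
```

The resulting value lies between .14 and .15, well above the .05 threshold reached at a chi-square of about 3.84, which is consistent with the conclusion that APC.2 adds nothing to the association.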

Discussion
We investigated the role of candidate gene polymorphisms in a case-control study of patients with large adenomatous polyps (n = 295), compared to patients with small adenomatous polyps (n = 302) and polyp-free controls (n = 568). The 38 polymorphisms studied belonged to 14 genes possibly involved in increasing colorectal cancer risk [8]. The reason for comparing these different groups of patients was that different genes might be involved in the different steps of carcinogenesis [7]. Using this case-control study, we had shown that MMP polymorphisms were most probably involved in the occurrence, but not in the growth, of polyps [10]. Using the combination test, we had shown that the effect was due to a specific combination of MMP1-MMP3 polymorphisms, which was not found using logistic regression. Indeed, a major property of this test is that it allows the detection of polymorphisms with low marginal effects [13].
Multiple testing may be a limitation of studies in which several polymorphisms in several genes are tested. The combination test allows a correction for multiple tests on tightly linked markers, which other methods of correction such as Bonferroni or the Benjamini-Hochberg false discovery rate (FDR) procedure do not. In the present study, we could show that the positive results obtained for AXIN2 and MDR1 when analyzing single polymorphisms were false positives, as these results were not confirmed when using the combination test. For independent markers, there was no need to correct for multiple testing, as our results were essentially negative.
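For comparison, the two standard corrections mentioned here can be sketched in a few lines. Both operate on the list of P-values alone and therefore ignore any correlation between linked markers; the function names and the P-values in the usage note are illustrative assumptions.

```python
def bonferroni(pvals):
    """Bonferroni-adjusted P-values: each P-value times the number of tests."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg step-up adjusted P-values (FDR control)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

For the hypothetical P-values `[0.01, 0.04, 0.03, 0.5]`, Bonferroni leaves only the first below .05, and the Benjamini-Hochberg adjustment is less severe for the intermediate values. Neither uses the genotype data, whereas a permutation procedure re-evaluates every test on the same shuffled data set and so accounts for their dependence.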
Our study did not give evidence for an effect of any of the gene polymorphisms studied except the p.Glu1317Gln variant of the APC gene. This variant had already been described in previous studies, with apparently conflicting results. Some studies reported an effect of this variant in patients with multiple adenomas or colorectal cancer [15-17], and others found an effect in adenomas but not in cancer [18,19]. More recently, Hahnloser et al. [20] reported an effect of this variant in a group of patients with a small number of adenomas (1 to 3 lifetime adenomas) and in a group of CRC cases. None of these studies considered the size of adenomas, so none could differentiate adenomas at high risk of cancer from those at low risk. Our results favour the hypothesis that the p.Glu1317Gln variant influences the growth and not the occurrence of adenomas, and thus indirectly influences colorectal cancer risk. Nevertheless, although the three groups are matched on gender and geographic origin, it is not known whether other known risk factors influenced adenoma growth. It is thus important to plan a specific multivariate analysis when building a replication sample.
Regarding the TS polymorphisms, we did not confirm the effect of TS suggested in some studies. Chen et al. [21] found a significant effect of the 3R allele of TS.2, particularly of the 3R/3R genotype (OR = 3.3). In our study, the OR associated with the 3R/3R genotype was 1.43 (CI 0.94-2.16) for LA versus PF and 1.10 (CI 0.68-1.78) for LA versus SA. On the other hand, Ulrich et al. [22] found no marginal effect of either of the two TS polymorphisms but found a significant interaction of TS.2 with folate intake. Similarly, the most recent studies [23,24] found no marginal effect of either polymorphism but a slight interaction with folate or vitamin B6 intake. The group sizes in our study were of the same order of magnitude as in the study of Chen et al. [21], but the three other studies, which did not show any marginal effect, included larger groups (more than 500 individuals in each group). Given the size of our population, we could a priori detect an OR of 1.7 with a power of at least 80%. It is highly probable that the TS gene, which produces a key enzyme in folate metabolism, has an effect only in interaction with dietary exposures, which would explain why this effect was not found in our study. Moreover, a new SNP has recently been identified in the second repeat of the 3R allele, splitting the 3R allele into 3RC and 3RG alleles and making the polymorphism tri-allelic. This variant could explain these discrepant results, since patients with the 3RC/3RC genotype have a transcriptional activity of TS comparable to that of patients with the 2R/2R genotype [25].
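An a priori power statement of this kind can be approximated as in the sketch below. The text does not describe how power was computed, so the two-sided two-proportion z-test approximation, the function names, and the carrier frequency of 0.20 among controls are all assumptions made for illustration; only the group sizes (295 LA, 568 PF) and the OR of 1.7 come from the study.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def power_for_or(odds_ratio, p_ctrl, n_cases, n_ctrls, z_alpha=1.959964):
    """Approximate power of a two-sided two-proportion z-test.

    p_ctrl: exposure (carrier) frequency among controls, a hypothetical input.
    The far tail of the test is ignored, which is negligible for OR > 1.
    """
    odds = odds_ratio * p_ctrl / (1.0 - p_ctrl)
    p_case = odds / (1.0 + odds)  # exposure frequency implied among cases
    se = math.sqrt(p_case * (1 - p_case) / n_cases
                   + p_ctrl * (1 - p_ctrl) / n_ctrls)
    return normal_cdf(abs(p_case - p_ctrl) / se - z_alpha)

# Hypothetical scenario: 20% of controls carry the genotype of interest
power = power_for_or(1.7, 0.20, 295, 568)
```

Under these assumptions the approximate power comes out above 0.8, in line with the statement in the text. Power falls as the exposure frequency moves towards 0 or 1, so the figure is not uniform across the polymorphisms studied.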
In our study, the problem of multiple testing was adequately controlled for each test performed by the permutation procedure implemented in the combination test. In the last analysis step, the very large number of comparisons resulted in a drastic correction of each P-value, which considerably lowered the power of the test. Thus, our negative results certainly do not demonstrate the absence of any specific combination influencing colorectal tumorigenesis. More probably, if such effects exist, they are small and could be found through the initial (and further) genome-wide association studies of colorectal cancer, which test much larger numbers of patients and controls [26,27].