Age-Dependent Cancer Risk Is Not Different in between MSH2 and MLH1 Mutation Carriers

Lynch syndrome is mostly characterized by early-onset colorectal and endometrial adenocarcinomas. Over 90% of the causal mutations occur in two mismatch repair genes, MSH2 and MLH1. The aim of this study was to evaluate the age-dependent cancer risk in MSH2 or MLH1 mutation carriers from data of DNA diagnostic laboratories. To avoid overestimation, evaluation was based on the age-dependent proportion of mutation carriers in asymptomatic first-degree relatives of identified mutation carriers. Data from 859 such eligible relatives were collected from 8 centers; 387 were found to have inherited the mutation from their relatives. Age-dependent risks were calculated either using a nonparametric approach for four discrete age groups or assuming a modified Weibull distribution for the dependence of risk on age. Cancer risk was estimated to start at age 28 (0.68 confidence interval: 25–32 years) and to reach nearly 0.70 at 70 years. The risks were very similar for MSH2 and MLH1 mutation carriers. Although the difference was not statistically significant, the risk in males appeared to precede that in females by ten years. This difference needs to be investigated in a larger dataset. If confirmed, it would indicate that the age at which colonoscopic surveillance begins may need to differ between male and female mutation carriers.


Introduction
Many genetic disorders have been found to exhibit a simple Mendelian inheritance pattern, and advances in the knowledge about their genetic basis have led to the expansion of DNA testing both for diagnosis and prediction of disease susceptibilities. In the case of mutations associated with an increased risk of common cancers, one parameter of major practical importance is the age-dependent cancer risk, defined as the risk for a mutation carrier of developing a tumor prior to a given age. Indeed, a precise knowledge of this parameter is instrumental in the counseling of individuals who are identified as carriers by genetic testing and who are faced with different options for cancer prevention or early detection.
A number of methodologies have been developed to estimate penetrance and the lifetime risk or recurrence risk of cancer-prone individuals. Often, the first evaluation of cancer risk is performed from symptomatic individuals identified in large pedigrees used in linkage studies [1]. It has been shown that such a design leads to severe overestimation. Alternative evaluation methods, which tend to reduce such biases, include population-based studies and/or prospective follow-up of unaffected mutation carriers. These approaches, however, may be expensive and time consuming, and may require long follow-up in order to provide sufficient information.
Hereditary nonpolyposis colorectal cancer (HNPCC), also known as Lynch syndrome, is an autosomal dominant condition caused by a mutation in one of several genes involved in DNA mismatch repair (MMR) [2]. Mutation carriers have been shown to be at high risk of developing colorectal and endometrial adenocarcinomas. In addition, significantly increased risks have been reported for cancers of the small bowel, upper urological tract, stomach, ovary, and biliary tract [3]. Although at least four MMR genes (MSH2, MLH1, MSH6, and PMS2) have been implicated in Lynch syndrome, more than 90% of the causative mutations have been identified in two of them, MSH2 and MLH1. It has been estimated that the prevalence of mutations in these two MMR genes in the general population of European origin is between 1 in 500 and 1 in 1000 [4]. The prevalence in colorectal cancer patients is 2.7% [4]. In studies that did not correct for the ascertainment of Lynch families, the estimated lifetime risk of colorectal cancer ranges from 0.68 to 0.82.
A precise knowledge of the age-dependent risk of cancer for individuals with deleterious MSH2 and MLH1 mutations is helpful in the identification and clinical management of families at high risk of colorectal and endometrial cancers. However, it has been recognized that evaluating the cancer risk of HNPCC individuals from symptomatic patients referred to a cancer family clinic leads to overestimation [1,5,6]. We previously outlined an evaluation method that may be less sensitive to recruitment bias. It is based on the age-dependent proportion of mutation carriers observed in asymptomatic offspring of mutation carriers. Applied to 267 individuals, it yielded age-dependent risks of first cancer in mutation carriers of approximately 0.43 at age 38 and 0.62 at age 51 [7]. The recent development of cancer family clinics offers the potential to generate a large amount of data, thus providing the opportunity to improve evaluations of cancer risk. Here, we present the method more explicitly and provide an example of its application by studying data from a total of 859 asymptomatic offspring of mutation carriers, distributed over an extended age range, collected through the contribution of hospital laboratories that perform genetic testing of the MSH2 and MLH1 genes in France and Switzerland. We also show that the number of observations has to be substantially increased in order to provide precise estimates.

Patients.
A retrospective questionnaire was sent to eight genetic units that offer germline analysis of the MSH2 and MLH1 genes under a Health Ministry agreement in France and Switzerland. For each genetic test performed on asymptomatic offspring of mutation carriers, the questionnaire asked for the following information: disease-causing germline mutation identified in the proband using the international mutation nomenclature (http://www.hgvs.org/mutnomen/), birth date, sex, and age at genetic diagnosis. The questionnaire was completed by the biologist who had validated the predictive tests. No follow-up data on the corresponding at-risk relatives were required.

Genetic Testing.
In all laboratories that provided data, the presence or absence of the disease-causing mutation was assessed on genomic DNA extracted from two independent blood samples, according to the French and Swiss rules for examination of individual genetic characteristics. Depending on the type of mutation found in the proband (point mutation or large genomic rearrangement), either DNA sequencing or quantitative fluorescent multiplex PCR [8] was performed.

Risk Calculation.
The method is based on the determination of the age-dependent proportion of mutation carriers observed in asymptomatic first-degree relatives of mutation carriers at the time of the genetic test. No question about the survival of mutation carriers is addressed in this work. The probabilities at birth for a first-degree relative of a mutation carrier to be either a mutation carrier or a nonmutation carrier are approximately equal and will be assumed equal in the rest of the study. For these two groups, the proportion of individuals that become symptomatic with age differs. Let $\Pi_{ng}(t)$ and $\Pi_{gc}(t)$ be the probability of a nonmutation carrier and of a mutation carrier to be affected by cancer before age $t$, respectively. $\Pi_{gc}(t)$ is also called cancer risk. The proportions $r_{ng}$ and $r_{gc}$ of asymptomatic individuals that can still be observed at age $t$ are $r_{ng} = 1 - \Pi_{ng}(t)$ and $r_{gc} = 1 - \Pi_{gc}(t)$. Therefore, if $p(t)$ is the proportion of mutation carriers at age $t$ in a group of asymptomatic first-degree relatives of mutation carriers, $p(t) = r_{gc}/(r_{ng} + r_{gc})$, it follows that
$$\Pi_{gc}(t) = 1 - \bigl(1 - \Pi_{ng}(t)\bigr)\,\frac{p(t)}{1 - p(t)}. \tag{1}$$
It is, therefore, possible to evaluate the age-dependent increased risk of mutation carriers (and, therefore, cancer risk) from the age-dependent risk of nonmutation carriers and the age-dependent proportion of mutation carriers among asymptomatic first-degree relatives of mutation carriers. In the rest of this work, we will assume that $\Pi_{ng}(t)$ remains small so that we have the approximate relationship:
$$\Pi_{gc}(t) \approx 1 - \frac{p(t)}{1 - p(t)} = \frac{1 - 2p(t)}{1 - p(t)}.$$
A nonparametric estimate of the cancer risk can be obtained from an estimate of $p(t)$, approximated by $N_{as}/(N_{ng} + N_{as})$, where $N_{as}$ (and $N_{ng}$) is the number of mutation carriers (and noncarriers) sampled at an age in the interval $[t - \Delta t, t + \Delta t]$. Thus, $\Pi_{gc}(t)$ is estimated as $(N_{ng} - N_{as})/N_{ng}$, with variance $N_{as}(N_{ng} + N_{as})/N_{ng}^{3}$ according to the $\delta$ method [9].
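The nonparametric estimate and its delta-method variance can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `nonparametric_risk`, the truncation of negative estimates at zero, the normal-approximation interval, and the counts in the usage line are all assumptions made for the example.

```python
import math

def nonparametric_risk(n_carriers, n_noncarriers, z=1.0):
    """Estimate cancer risk from asymptomatic relatives in one age window.

    n_carriers    -- asymptomatic mutation carriers observed (N_as in the text)
    n_noncarriers -- asymptomatic noncarriers observed (N_ng in the text)
    z             -- normal quantile; z=1.0 gives roughly a 0.68 interval
    """
    if n_noncarriers <= 0:
        raise ValueError("at least one noncarrier is required")
    # Point estimate (N_ng - N_as) / N_ng, truncated at zero (an assumption
    # made here for windows where carriers outnumber noncarriers).
    risk = max(0.0, (n_noncarriers - n_carriers) / n_noncarriers)
    # Delta-method variance N_as * (N_ng + N_as) / N_ng**3.
    variance = n_carriers * (n_noncarriers + n_carriers) / n_noncarriers ** 3
    sd = math.sqrt(variance)
    return risk, sd, (risk - z * sd, risk + z * sd)

# Hypothetical counts, not taken from the study's Table 1.
risk, sd, interval = nonparametric_risk(n_carriers=40, n_noncarriers=80)
print(f"risk={risk:.3f}, sd={sd:.3f}")
```

The interval is symmetric on the risk scale and is not clipped to [0, 1]; with small counts a bootstrap or exact interval may be preferable.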
A parametric estimate based on the logistic regression model can also be proposed. Based on the definition of $\Pi_{gc}(t)$ and $p(t)$, we have the following important relationship:
$$\frac{p(t)}{1 - p(t)} = 1 - \Pi_{gc}(t).$$
Assuming the following modified Weibull distribution for the cancer risk:
$$\Pi_{gc}(t) = 1 - \exp\bigl(-\beta\,(t - \tau)_{+}\bigr), \qquad (t - \tau)_{+} = \max(0,\, t - \tau),$$
we observe that
$$\log\frac{p(t)}{1 - p(t)} = -\beta\,(t - \tau)_{+}.$$
This equation suggests the familiar logistic regression model. We can consider the following simple model, in which $Y = 1$ if an asymptomatic relative tested at age $t$ is a mutation carrier and $Y = 0$ otherwise:
$$\Pr(Y = 1 \mid t) = \frac{\exp\bigl(-\beta\,(t - \tau)_{+}\bigr)}{1 + \exp\bigl(-\beta\,(t - \tau)_{+}\bigr)}.$$
We can find the maximum likelihood estimate of $(\beta, \tau)$ by maximizing the likelihood function based on the above probability model. Once we have an estimate for $(\beta, \tau)$, we can obtain the estimate for $\Pi_{gc}(t)$ for any given $t$. We call this estimate the parametric estimate. To obtain the confidence interval of the parametric estimate, we use the standard bootstrap method [10]. To compare the cancer risk between two groups, such as males and females, we used the likelihood ratio statistic that compares the likelihood assuming the same parametric model (i.e., a common $(\beta, \tau)$ for both samples) with the likelihood obtained by allowing $(\beta, \tau)$ to vary between the two samples. The statistical significance of the test can be evaluated through a permutation procedure by randomly shuffling the group labels (i.e., gender or gene name) among all subjects.
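A maximum likelihood fit of this kind might look as follows. The sketch assumes a shifted exponential form for the risk, Π_gc(t) = 1 − exp(−β(t − τ)) for t > τ and 0 before τ, which is one reading of the "modified Weibull" model with parameters (β, τ); the exact functional form, the grid search, all function names, and the synthetic data are assumptions of this example rather than the authors' implementation.

```python
import math
import random

def cancer_risk(age, beta, tau):
    """Assumed risk curve: 0 before tau, then 1 - exp(-beta * (age - tau))."""
    return 1.0 - math.exp(-beta * max(0.0, age - tau))

def carrier_probability(age, beta, tau):
    """Carrier share p(t) among asymptomatic relatives, from p/(1-p) = 1 - risk."""
    odds = 1.0 - cancer_risk(age, beta, tau)
    return odds / (1.0 + odds)

def log_likelihood(data, beta, tau):
    """Bernoulli log-likelihood; data holds (age, is_carrier) pairs."""
    total = 0.0
    for age, is_carrier in data:
        p = carrier_probability(age, beta, tau)
        total += math.log(p if is_carrier else 1.0 - p)
    return total

def fit_by_grid(data, betas, taus):
    """Return the (beta, tau) grid point with the highest log-likelihood."""
    return max(((b, t) for b in betas for t in taus),
               key=lambda pair: log_likelihood(data, pair[0], pair[1]))

def simulate(n, beta, tau, rng, min_age=18, max_age=70):
    """Draw synthetic asymptomatic relatives whose carrier status follows the model."""
    data = []
    for _ in range(n):
        age = rng.uniform(min_age, max_age)
        data.append((age, rng.random() < carrier_probability(age, beta, tau)))
    return data

# Synthetic data only; parameter values are illustrative, not the study's dataset.
rng = random.Random(0)
TRUE_BETA, TRUE_TAU = 26 / 1000, 28
data = simulate(800, TRUE_BETA, TRUE_TAU, rng)
betas = [k / 1000 for k in range(2, 81, 2)]  # candidate beta values, 0.002 to 0.080
taus = list(range(18, 41))                   # candidate onset ages, 18 to 40 years
beta_hat, tau_hat = fit_by_grid(data, betas, taus)
print(beta_hat, tau_hat, round(cancer_risk(70, beta_hat, tau_hat), 2))
```

A grid search is used only to keep the sketch dependency-free; a numerical optimizer would normally replace it. Resampling `data` with replacement and refitting each resample would give a bootstrap interval for the fitted parameters.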

Results
Eight genetic units contributed information on a total of 859 asymptomatic offspring of mutation carriers: 581 from SO, [...]
Table 1.
Figure 1: For the youngest group, the risk was evaluated as zero. For the other four groups, the vertical line is placed at the median age and indicates the 0.68 confidence interval for the evaluated risk. The red curve shows the risk evaluated assuming that it follows a modified Weibull distribution as a function of age. The two flanking dotted curves indicate the 0.68 confidence interval of this evaluation. The age at onset of the risk is evaluated as 28 years (0.68 confidence interval = 25-32 years).
[...] functional tests in yeast, cosegregation analyses, and tumor cell studies. First-degree relatives of index cases carrying DNA variants of unknown significance were removed from this study. The 859 unaffected first-degree relatives of MSH2 and MLH1 mutation carriers were classified into five age groups (Table 1). When both the gender and the nature of the mutated gene (whether MSH2 or MLH1) are considered, the number of observations in each group is too small to enable a nonparametric evaluation of the cancer risk. Pooling together the four groups (MSH2, MLH1, males, and females), we attempted a nonparametric evaluation of the age-dependent cancer risk. It can be observed that as the pooled age groups become older, the proportion of mutation carriers tends to decrease. This decrease is due to the removal from the study of the mutation carriers who become symptomatic. In the youngest age group, the number of mutation carriers was larger than that of the nonmutation carriers, an observation that is likely due to the small number of observations and which suggests that cancer risk is very small in this age group. For the other groups, the proportion of mutation carriers was smaller than 0.5, and thus for these groups, a nonzero cancer risk could be estimated (Figure 1). For instance, for the age group between 50 and 60, the median age was 53, and the cancer risk was evaluated as 0.43. We note that the standard deviation of the present evaluation is large.
In an attempt to obtain a more precise evaluation, we performed a parametric estimate of the age-dependent cancer risk, assuming that cancer risk would be negligible before an age threshold called τ and that, starting from this age, cancer risk would increase according to a Weibull distribution with a parameter β. Weibull distributions are commonly used in survival analyses and have been applied to parameterize age-dependent cancer risk [11]. Under this model, the maximum likelihood estimate of τ is 28 years (68% confidence interval 25-32 years). After this age, cancer risk rises rapidly and reaches a value of 0.48 (68% confidence interval 0.42-0.54) at 53 years, and 0.67 (68% confidence interval 0.59-0.74) at 70 years. Beyond 70 years, the probability that a nonmutation carrier has developed cancer becomes substantial, so that the model may need to be corrected according to (1). The number of asymptomatic first-degree relatives older than 70 is small in our series, and no attempt was made to evaluate cancer risk after this age.
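To give a sense of the size of this correction, the sketch below compares the small-background approximation with the exact relation that follows from p(t) = r_gc/(r_ng + r_gc), namely Π_gc(t) = 1 − (1 − Π_ng(t)) p(t)/(1 − p(t)). The function name `corrected_risk` and the numerical inputs are hypothetical and chosen only for illustration.

```python
def corrected_risk(p_carrier, noncarrier_risk=0.0):
    """Carrier cancer risk from the carrier proportion p(t) among asymptomatic
    relatives and the noncarrier risk Pi_ng(t); the default 0.0 reproduces the
    small-background approximation used in the main analysis."""
    if not 0.0 <= p_carrier < 1.0:
        raise ValueError("p_carrier must lie in [0, 1)")
    if not 0.0 <= noncarrier_risk < 1.0:
        raise ValueError("noncarrier_risk must lie in [0, 1)")
    return 1.0 - (1.0 - noncarrier_risk) * p_carrier / (1.0 - p_carrier)

# Hypothetical values: a quarter of asymptomatic relatives are carriers and
# 5% of noncarriers have already developed cancer by this age.
p = 0.25
approximate = corrected_risk(p)
exact = corrected_risk(p, noncarrier_risk=0.05)
print(f"approximate={approximate:.3f}, exact={exact:.3f}")
```

Because the correction multiplies the odds p/(1 − p) by a factor below one, neglecting a nonzero noncarrier risk leads to a slight underestimate of the carrier risk.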
A similar method was applied separately to asymptomatic individuals with first-degree relatives carrying an MLH1 or an MSH2 mutation. The difference in risk between MSH2 and MLH1 mutation carriers appeared minimal (Figure 2(a)). When males and females were analyzed separately, the age-dependent risk for males appeared shifted by about ten years as compared to females. Under the parametric model, the age at onset of the increased risk is 23 years in males (95% confidence interval 11-27), whereas it is 32 years in females (95% confidence interval 29-37). However, a permutation test failed to demonstrate statistical significance (P = .15).
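The permutation procedure itself is simple to sketch. In the example below the test statistic is passed in as a function; the study uses a likelihood ratio statistic from the parametric model, whereas the stand-in shown here (the absolute difference in carrier proportion between the two groups, ignoring age) is a deliberately simplified substitute. All names and data are hypothetical.

```python
import random

def carrier_share_gap(carrier_flags, labels):
    """Stand-in statistic: absolute difference in carrier proportion
    between the two label groups (expects exactly two distinct labels)."""
    groups = {}
    for flag, label in zip(carrier_flags, labels):
        groups.setdefault(label, []).append(flag)
    first, second = (sum(g) / len(g) for g in groups.values())
    return abs(first - second)

def permutation_p_value(carrier_flags, labels, statistic, n_perm, rng):
    """Share of label shufflings whose statistic is at least the observed one.

    The +1 terms count the observed labelling as one of the permutations,
    so the returned p-value is never exactly zero."""
    observed = statistic(carrier_flags, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign group labels at random
        if statistic(carrier_flags, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical data: 1 marks a mutation carrier, labels give the sex.
flags = [1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1]
sexes = ["M"] * 8 + ["F"] * 8
rng = random.Random(1)
p_value = permutation_p_value(flags, sexes, carrier_share_gap, 500, rng)
print(p_value)
```

Shuffling preserves the group sizes, so the stand-in statistic is always defined. Replacing `carrier_share_gap` with a function that refits the parametric model within each group would give the likelihood ratio version described in the text.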

Discussion
There is a clear need to improve our estimation of the age-dependent cancer risk for many genetic diseases. This is especially important for those conditions that predispose to cancer, as this knowledge may influence the definition of the best surveillance protocol. The development of presymptomatic DNA diagnostic tests offers an opportunity to improve this knowledge. However, we need to apply methods that are less prone to bias than those based on the age at onset of symptomatic individuals referred to cancer family clinics [11,12].
The evaluation method discussed in this paper requires a set of data that are collected in a two-stage process. In the first stage, symptomatic individuals are referred to a clinic and the deleterious mutations are identified. Importantly, the age at onset of the symptomatic individuals collected at this stage is not used to evaluate cancer risk, as it is well known that such a procedure may lead to major overestimation. In the second stage, asymptomatic first-degree relatives of individuals with an identified mutation are recruited and a test is conducted to determine their mutation status. Cancer risk is evaluated only from the age-dependent proportion of gene carriers in this group of asymptomatic first-degree relatives. This method shares some of the potential biases that may be observed in population-based studies. The highly penetrant mutations are likely to contribute more than the low-penetrant mutations, as the former are more readily detected than the latter. Also, mutations that lie in chromosomal regions investigated by routine DNA diagnostic techniques (e.g., mainly exonic point mutations or large genomic rearrangements) have been preferentially included in the study. Thus, the group of mutations that have been evaluated for cancer risk may not be representative of the mutation spectrum that is present in the population. However, for the group of mutations that have been identified, the method appears minimally biased. This lack of bias stems from the requirement that the individuals included in the study be asymptomatic.
The present method requires a large number of observations. With the present dataset, the variance of our estimate is large, and the estimate is barely informative when a nonparametric evaluation method is used (e.g., when individuals are pooled into 10-year age groups). Assuming a modified Weibull distribution for cancer risk decreases this variance at the cost of minimal hypotheses. Simulation studies indicate that a 4-fold increase in the number of observations would narrow the confidence interval by a factor of 2 (results not shown). With the development of presymptomatic DNA diagnostic tests, such numbers should be obtainable in the near future at little cost.
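The factor of 2 is what the usual square-root scaling would predict. As an illustration for the nonparametric estimator only (the simulations mentioned above are not reproduced here), multiplying both counts in the delta-method variance $N_{as}(N_{ng} + N_{as})/N_{ng}^{3}$ by a common factor $k$, with the carrier proportion held fixed, gives:

```latex
\mathrm{Var}_k
  = \frac{k N_{as}\,\bigl(k N_{ng} + k N_{as}\bigr)}{\bigl(k N_{ng}\bigr)^{3}}
  = \frac{1}{k}\cdot\frac{N_{as}\,(N_{ng} + N_{as})}{N_{ng}^{3}}
  = \frac{\mathrm{Var}_1}{k},
\qquad
\mathrm{SD}_k = \frac{\mathrm{SD}_1}{\sqrt{k}},
\qquad
\mathrm{SD}_4 = \frac{\mathrm{SD}_1}{2}.
```

Under a normal approximation the width of the confidence interval is proportional to the standard deviation, so four times as many observations would halve it.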
In the present work, we have applied the method to the evaluation of the age-dependent cancer risk of mutations in the MSH2 and MLH1 genes associated with Lynch syndrome. The resulting evaluation of the age-dependent cancer risk is consistent with those previously published from population-based studies [13]. It confirms that the previous evaluation based on the age at onset of retrospectively included symptomatic individuals was an overestimate. It also indicates that, as previously proposed, the cancer risk of MLH1 and MSH2 mutations is similar [14]. An analysis distinguishing gender also suggests that the onset of the increased risk occurs ten years earlier in males than in females. This observation, if confirmed, suggests that colonoscopy surveillance in males may have to be started earlier than in females, possibly leading to changes in the standard guidelines [15,16]. Also, if it is confirmed that cancer risk is lower at all ages in females than in males, it would imply that the colorectal risk may be much smaller in females than in males, since females are also at high risk of endometrial cancer. Similar observations have been recently published in the literature [12,14].
Besides Lynch syndrome, it would be of interest to apply this approach to predisposing conditions whose first manifestations are not present at birth and which may have irreversible deleterious consequences when the diagnosis is delayed until symptoms appear. This includes not only cancer-predisposing conditions such as those associated with BRCA mutations [17,18] but also possibly conditions associated with other degenerative processes (neurological or metabolic).