Quasimonomorphic Mononucleotide Repeats for High-Level Microsatellite Instability Analysis

Microsatellite instability (MSI) analysis is becoming increasingly important for detecting sporadic primary tumors of the MSI phenotype and for helping to identify Hereditary Non-Polyposis Colorectal Cancer (HNPCC) cases. After some years of conflicting data due to the absence of consensus markers for the MSI phenotype, a meeting held in Bethesda to clarify the situation proposed a set of 5 microsatellites (2 mononucleotide repeats and 3 dinucleotide repeats) to determine MSI tumors. A second Bethesda consensus meeting was held at the end of 2002. There it was discussed that the 1998 microsatellite panel could underestimate high-level MSI tumors and overestimate low-level MSI tumors. Amongst the suggested changes was the exclusive use of mononucleotide repeats in place of dinucleotide repeats. We have already proposed a pentaplex MSI screening test comprising 5 quasimonomorphic mononucleotide repeats. This article compares the advantages of mono- and dinucleotide repeats in determining microsatellite instability.


Introduction
Microsatellites are repetitive nucleotide sequences distributed throughout the genome. They are highly polymorphic, and as a consequence are widely used to detect chromosome arms or fragments showing loss of heterozygosity (LOH) in human cancers. The observation in a subset of tumors of new microsatellite alleles absent in corresponding normal DNA led to the discovery some years ago of the so-called microsatellite instability phenotype [1][2][3] in tumors now referred to as MSI-H [4]. The presence of microsatellite instability is the hallmark of this phenotype and is found in about 10-15% of sporadic colon, gastric and endometrial tumors and in the majority of tumors from patients with the Hereditary Non-Polyposis Colorectal Cancer (HNPCC) syndrome [4]. It is due to a defect of the mismatch repair (MMR) system [5][6][7][8]. This issue of Disease Markers is completely devoted to MSI and HNPCC tumors, and hence there is no need to argue here for the importance of rapid and reliable tests to define the MSI phenotype. We will instead compare the sensitivity, specificity and ease of use of the different microsatellites proposed to determine MSI tumors.
MSI in tumor DNA is defined as the presence of alternate sized repetitive DNA sequences that are not seen in the corresponding germline DNA. Many different microsatellites have been studied with the aim of identifying MSI tumors. Depending on the type (mono-, di-, tri-nucleotide) and number of microsatellites analyzed, widely variable results have been published for the frequency of MSI in different tumor types [9]. In 1997 an international consensus meeting proposed a panel of five markers for the uniform analysis of MSI [4]. This included two mononucleotide (BAT-25 and BAT-26) and three dinucleotide (D5S346, D2S123 and D17S250) repeats. Tumors with instability at two or more of these markers were defined as being MSI-H, while those with instability at one repeat or showing no instability were defined as MSI-L and MSS tumors respectively. These markers were reevaluated at the last Bethesda consensus meeting, held in December 2002, which concluded that there were caveats on their continued use because of the dinucleotide repeats. Our group has been heavily involved in the characterization of mononucleotide repeats and their use to determine the MSI phenotype [10][11][12]. We recently described a pentaplex PCR assay comprising 5 quasimonomorphic mononucleotide repeats [13]. We outline below the advantages of using mononucleotide repeats rather than dinucleotide repeats to determine the MSI status of tumors.
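The scoring rule of the consensus panel described above can be written as a short function. This is only an illustrative sketch: the function name `classify_bethesda` and the way unstable markers are passed in are assumptions made for the example, not part of the consensus report.

```python
# Five markers of the consensus panel, as listed in the text.
BETHESDA_PANEL = ("BAT-25", "BAT-26", "D5S346", "D2S123", "D17S250")

def classify_bethesda(unstable):
    """Classify a tumor from the names of the panel markers scored as unstable.

    Hypothetical helper: two or more unstable markers -> MSI-H,
    exactly one -> MSI-L, none -> MSS.
    """
    unstable = set(unstable)
    unknown = unstable - set(BETHESDA_PANEL)
    if unknown:
        raise ValueError(f"markers not in the panel: {sorted(unknown)}")
    if len(unstable) >= 2:
        return "MSI-H"
    if len(unstable) == 1:
        return "MSI-L"
    return "MSS"

print(classify_bethesda(["BAT-26", "D2S123"]))
print(classify_bethesda(["D5S346"]))
print(classify_bethesda([]))
```

Note that this rule treats all five markers alike; it does not distinguish whether the unstable markers are mono- or dinucleotide repeats.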

Di- and mononucleotide repeat polymorphisms in normal DNA
Most dinucleotide repeats are highly polymorphic. As a consequence, they frequently have different sizes between individuals and between both alleles of the same person.
Mononucleotide repeats in the 20-30 bp range are common in non-coding intronic and 5' and 3' UTR sequences of human genes [14]. A number of these repeats are polymorphic and will not be discussed here, since their instability properties are probably similar to those of dinucleotide repeats. For unknown reasons however, some mononucleotide repeats are monomorphic, or at least quasimonomorphic. BAT-26 is the best known example of this type of repeat and our group was the first to show its quasimonomorphic nature [10]. We also showed that BAT-25 had similar properties [11,12] and more recently characterized the NR-21, NR-22 and NR-24 mononucleotide repeats [13]. Polymorphisms in each of these repeats are found in less than 1% of the Caucasian population [10,11,13,15] and in approximately 10% of African and Afro-American populations [13,15,16]. The ethnic variation of these repeats has not been fully reported up until now. It appears that BAT-26 is also monomorphic in Asian populations [17,18] with the exception of North Indians, where it shows some genetic variation [19].

Methods used to analyze the MSI status with di- and mononucleotide repeats
To analyze microsatellite instability with dinucleotide repeats the comparison of tumor DNA with matching germline DNA is mandatory. This allows identification of additional alleles in the tumor DNA as compared to normal DNA.
For the quasimonomorphic BAT-26, BAT-25, NR-21, NR-22 and NR-24 mononucleotide repeats, the size of the PCR products in any non-tumor DNA is by definition almost always the same. We proposed that a tumor should be classified as MSI-H when at least 3 out of 5 mononucleotide repeats show instability [13]. With an average polymorphism frequency of 1% and 10% for each mononucleotide marker in Caucasian and African populations respectively, the probability that three given markers are all polymorphic in the same individual is about 10⁻⁶ (0.01³) for Caucasians and 10⁻³ (0.1³) for Africans. The analysis of matching normal DNA is therefore not an absolute necessity in order to establish the MSI status of human tumors when using mononucleotide repeats. 0-1.
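The figures quoted above correspond to p³ for three given markers. As a rough check, the sketch below also computes the binomial probability that any 3 or more of the 5 markers are polymorphic in one individual. It assumes that the markers are independent and share the same polymorphism frequency, which is a simplification introduced here; the function name `prob_at_least` is hypothetical.

```python
from math import comb

def prob_at_least(k, n, p):
    """P(at least k of n independent markers are polymorphic), each with frequency p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Per-marker polymorphism frequencies quoted in the text.
for label, p in (("1% per marker", 0.01), ("10% per marker", 0.10)):
    print(f"{label}: p^3 = {p**3:.1e}, P(>=3 of 5) = {prob_at_least(3, 5, p):.1e}")
```

Under these assumptions the "any 3 of 5" probability is roughly an order of magnitude larger than p³ (about 1 × 10⁻⁵ and 9 × 10⁻³ respectively), which remains small enough to support the argument that matching normal DNA is not strictly required.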

Di- and mononucleotide repeat instability in MSI-H tumors
When unstable in an MSI-H tumor, one or both alleles of a dinucleotide repeat can be the target of instability. The consequence of this is the deletion or insertion of one or more DNA repeat units. When both alleles have a different size in a particular individual, the instability of one allele may by chance result in it having the same size as the other allele. In these situations, instability can be interpreted as loss of heterozygosity. Dinucleotide microsatellites are not always unstable in MSI-H tumors. In two reports that analyzed MSI status using the Bethesda markers, the sensitivities of the dinucleotide repeats were reported to be 85-89% for D2S123, 77-81% for D17S250 and 59-69% for D5S346 [20,21]. Moreover, dinucleotide microsatellite amplification profiles are sometimes difficult to interpret and there have been cases where three experienced reviewers reported discrepancies in scoring [20].
Quasimonomorphic mononucleotide repeats are far more sensitive than dinucleotide repeats in detecting MSI. We have shown that BAT-26, BAT-25, NR-21, NR-22 and NR-24 each had sensitivities above 95% in a series of 64 MSI-H colon primary tumors [13]. We took a cut-off value of instability at 3 of the 5 mononucleotide repeats to define MSI-H tumors, but in fact more than 99% of the MSI-H tumors we analyzed were unstable at 4 or 5 of the markers. This makes it very unlikely that an MSS tumor from a patient with multiple polymorphisms in the markers would be confused with an MSI-H tumor. Loukola et al. showed that BAT-26 and BAT-25 were unstable in 100% of 27 MSH2 or MLH1 mutation-positive HNPCC cases [20]. Moreover, there was not a single scoring discrepancy between three reviewers with BAT-26 and BAT-25. 0-2.
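Because the normal allele size of each quasimonomorphic marker is essentially fixed, the 3-of-5 cut-off can be applied to tumor DNA alone. The sketch below illustrates this using the normal product sizes reported for the original pentaplex assay [13]. The threshold `min_shift_bp`, the function names and the tumor allele sizes are hypothetical values chosen for illustration; the article does not specify a numerical size-shift threshold.

```python
# Normal PCR product sizes (bp) of the original pentaplex assay [13].
REFERENCE_SIZES_BP = {
    "BAT-26": 121, "BAT-25": 124, "NR-21": 104, "NR-22": 143, "NR-24": 134,
}

def unstable_markers(observed_sizes_bp, min_shift_bp):
    """Markers whose main tumor allele differs from the normal size by >= min_shift_bp.

    min_shift_bp is a hypothetical scoring threshold, not a value from the article.
    """
    return [
        marker
        for marker, ref in REFERENCE_SIZES_BP.items()
        if marker in observed_sizes_bp
        and abs(observed_sizes_bp[marker] - ref) >= min_shift_bp
    ]

def classify_pentaplex(observed_sizes_bp, min_shift_bp):
    """MSI-H when at least 3 of the 5 mononucleotide markers are unstable."""
    n_unstable = len(unstable_markers(observed_sizes_bp, min_shift_bp))
    return "MSI-H" if n_unstable >= 3 else "not MSI-H"

# Invented tumor profile: three shortened alleles, two of normal size.
tumor = {"BAT-26": 111, "BAT-25": 118, "NR-21": 97, "NR-22": 143, "NR-24": 134}
print(unstable_markers(tumor, min_shift_bp=3))
print(classify_pentaplex(tumor, min_shift_bp=3))
```

No matching normal DNA enters the calculation; the reference sizes play that role.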

Di- and mononucleotide repeat instability in non-MSI-H tumors
Tumors showing instability at only one microsatellite are defined as MSI-L when using the Bethesda panel [4]. In most cases, the single unstable repeat is a dinucleotide. We have analyzed a series of 90 colon primary tumors and matching normal DNA with a large number of dinucleotide repeats (average successful amplifications was 65 repeats for normal/tumor DNA pairs). Of the 90 samples, 48 (53%) were unstable in at least one dinucleotide repeat, but in less than 50% of the repeats (13 MSI-H cases) [10]. Other groups have obtained similar results and it has been suggested that, if a great number of dinucleotide microsatellites is analyzed, all colorectal tumors would be classified as MSI-L [22,23]. Depending on which dinucleotide repeats are chosen in a small panel of microsatellites, a tumor could be classified as MSI-L or MSS. In fact, the existence of MSI-L tumors is still a matter of debate [24]. Real clinical differences between MSI-L and MSS tumors have not been reported, and MSI-L tumors (like MSS tumors) have never been demonstrated to have a mutation in any mismatch repair gene responsible for microsatellite instability. It appears that a given dinucleotide repeat may show instability in tumors which are not mismatch repair deficient. In some cases, it has even been reported that this fact is a non-reproducible PCR artifact due to the quality and/or quantity of DNA rather than real instability [25]. Although unstable on 2 out of the 5 microsatellites of the Bethesda panel, a tumor is not necessarily MSI-H if the two unstable microsatellites are dinucleotide repeats. This fact was acknowledged at the recent Bethesda consensus meeting. It is the main caveat of the original Bethesda panel of microsatellites, since it is now recommended to analyze more mononucleotide repeats in these particular cases.
As discussed further below, deletions in BAT-26 are proposed to occur stepwise during tumor progression. The same should be true for other quasimonomorphic mononucleotide repeats. Instability of these repeats in MSI-H tumors is due to the accumulation of successive deletions during tumor progression. As a consequence, if such a repeat shows instability in non-MSI-H tumors due to a general instability phenomenon, it will be a short deletion that will not be considered to represent genuine instability when scoring MSI status. Thus, the use of mononucleotide repeats will never result in the scoring of MSI-L tumors, nor will it falsely score an MSI-L tumor as an MSI-H one. 0-3.

Mutations in MSH6 and di- and mononucleotide repeat instability
The MSI-H phenotype is due to a defect in the cell mismatch repair system. Generally, this is a point mutation in the hMSH2 or hMLH1 genes in HNPCC cases, or methylation of the hMLH1 promoter in sporadic MSI-H cases. Other mismatch repair system genes have been reported to be altered in some cases and responsible for the MSI phenotype. The hMSH6 gene is one such example [26]. The mismatch repair system is composed of a number of proteins recognizing mismatches introduced by errors of the DNA polymerase during DNA replication. It is known that the components of these protein complexes differ according to the size of the deletions/insertions to be repaired, and hMSH6 is not involved in the repair of mismatches of two or more bp. In other words, a tumor with an hMSH6 mutation is stable at dinucleotide repeats [26]. Accordingly, the analysis of dinucleotide repeats will not recognize such tumors.
The hMSH6 protein is specifically involved in the mismatch repair of nucleotide substitutions and 1 bp deletions or insertions [26]. Two cell lines with hMSH6 mutations, HCT-8 and HCT-15, are unstable at BAT-26 and the other mononucleotide repeats [13]. Our pentaplex PCR reaction with 5 quasimonomorphic mononucleotide repeats is thus sensitive for the detection of MSI tumors with mutations in hMSH6. 0-4.

DNA mislabeling and di- and mononucleotide repeat instability
All those involved in the MSI field have seen at least once in a published figure the dinucleotide PCR profiles of a tumor where one or two alleles are completely different to those of the compared matching normal DNA, and where there is no normal sized allele. Since primary tumors, as opposed to cell lines, are very rarely 100% pure even after an enrichment step, the complete absence of normal-sized alleles in these PCR profiles raises the possibility of mislabeling of DNA samples. This fact has already been pointed out by Perucho in 1999 [9]. In our analysis of BAT-26 and of a large number of dinucleotide repeats in a series of 160 tumors and cell lines to compare their efficiency to determine MSI status, we had a single sample showing conflicting data [10]. This was a colorectal primary tumor unstable on dinucleotide repeats and not on BAT-26. After further investigation, we found that there were 2 tubes with the same number in our bank of frozen samples. Indeed the instability of dinucleotide repeats in this sample was because the tumor and normal DNA were not from the same individual. The same explanation is probably true for many of the supposedly matched tumor/normal samples showing very different PCR profiles, which are not due to instability but rather to sample mislabeling.
Figure legend: For each marker, normal allelic size is indicated by shaded areas. PCR artefacts or contaminating bands are marked by asterisks, but they do not interfere with scoring mononucleotide repeats since 1) they are far from products with the same dye and 2) they do not have the specific profiles of PCR products containing mononucleotide repeats.
Due to the quasi-monomorphic nature of mononucleotide repeats, the analysis of matching normal DNA is not required and the above type of sample mixing cannot occur. In the second Bethesda meeting report, it will be reported that "dinucleotide repeats. . . may provide internal control for the prevention of sample mixup". We feel this is not necessary and propose to keep things as simple as possible. Indeed, when any analysis is done on patient samples, there is no control to check if the sample being analyzed is really that of the patient to be analyzed other than careful tracking of the samples by an appropriate coding system. The same could easily be achieved with DNA extracted from tumors without resorting to normal matching DNA and to dinucleotide repeats. The only necessary precaution is, as usual, to perform a control PCR without DNA to avoid potential PCR contamination problems. 0-5.

Additional information provided by di- and mononucleotide repeat instability
As far as we know, no additional information can be obtained by the analysis of dinucleotide repeats.
In contrast, mononucleotide repeats can provide significant additional information. It has been shown that shortening of the BAT-26 and BAT-25 alleles is progressive and concomitant [11,27,28]. In MSI-H adenoma, shortening of BAT-26 is less than in the corresponding MSI-H carcinoma [29]. It is known that MSI-H tumor progression is due to the accumulation of mutations in short coding repeats within genes involved in growth control and other important pathways [30]. The number of genes known to contain such mutations is increasing, as recently reviewed [31]. We have defined a Shortening Index at Non Coding repeats (SINC) with BAT-26 and BAT-25 and showed it to be positively correlated with the accumulation of mutations in coding repeats, suggesting that it could be a molecular clock for tumor progression [30]. Moreover, the mutation frequency reported for a given instability target gene in MSI-H tumors can differ greatly between studies [31]. We have demonstrated that BAT-26 and BAT-25 amplification profiles can indicate the percentage of contamination of a primary tumor sample by normal stromal cells (Brennetot et al. submitted for publication). We have suggested that highly contaminated tumor samples, as indicated by mononucleotide amplification profiles, should be enriched by microdissection or other methods prior to further molecular studies, particularly those involving screening for mutations in target genes for instability. All of the above, defined with BAT-26 and BAT-25, can be extended to NR-21, NR-22 and NR-24, giving even more precise information. 0-6.

Conclusions
0-6, game, set and match for the mononucleotide repeats!! We showed in this review that mononucleotide repeats are much more informative, sensitive, specific and easy to use than dinucleotide repeats for detecting MSI-H tumors, without being hindered by MSI-L tumors whose real existence has yet to be proven.
We have already characterized 5 mononucleotide repeats, namely BAT-26, BAT-25, NR-21, NR-22 and NR-24, and proposed that concurrent use of these five microsatellites allows accurate evaluation of tumor MSI status with 100% sensitivity, 100% specificity and without the need to analyze corresponding normal DNA [13]. Moreover we have defined conditions to amplify all these in a single pentaplex PCR reaction making this detection a one-step procedure [13]. This assay is thus technically simpler to use than assays with dinucleotide markers. It also reduces the number of PCR amplifications from 10 to 1 making this test much less expensive. Although the PCR products of these 5 markers are labeled with different dyes, one can have some interference between the different markers if the laser detection system is imperfectly adjusted. To avoid this possible technical problem, we have defined new primers to obtain PCR products of non-overlapping sizes (see appendix).
As indicated by the last Bethesda consensus meeting, the set of 5 quasimonomorphic mononucleotide repeats defined here is likely to provide the best option described so far for determining the MSI status of sporadic or hereditary human tumors. This method does not require new specific equipment, and is not only more sensitive and specific, but is also less time-consuming and less costly than previously used methods.

Acknowledgements
We thank Dr Barry Iacopetta for critical reading of the manuscript.
Appendix
The sizes of the PCR products of the 5 mononucleotide repeat markers described by Suraweera et al. [13] are 121, 124, 104, 143 and 134 bp for BAT-26, BAT-25, NR-21, NR-22 and NR-24 respectively. Because these markers show an average deletion of 5-12 bp in MSI tumors, the PCR products can overlap. Our aim was to shift the primers so that the amplification products would differ in size by at least 20 bp from each other, allowing a clear separation of each of them, even when deleted due to microsatellite instability in tumor DNA. With this change, any potential scoring problem due to an imperfect adjustment of the laser detection system is eliminated.
We were not able to establish satisfactory new conditions for marker NR-22. As an alternative, another quasimonomorphic mononucleotide repeat, termed NR-27, was used successfully together with BAT-26, BAT-25, NR-21 and NR-24. Gene names, primer sequences, sizes of the PCR products and accession numbers of the corresponding cDNAs are given in Table 1.
In each case, the anti-sense primer was labeled with a fluorescent dye: FAM for BAT-26 and NR-21, HEX for BAT-25 and NR-27, and NED for NR-24.
BAT-26, BAT-25, NR-24, NR-21 and NR-27 amplify in a standard multiplex PCR with an annealing temperature of 55 °C. PCR products are 183, 153, 131, 109 and 87 bp respectively when normal DNA (or MSS tumor DNA) is amplified. There is thus, as intended, a size difference of at least 20 bp between each PCR product. Figure 1 shows amplification profiles obtained with one MSI-H cell line, one MSS cell line and two MSI-H primary colorectal tumors using these new conditions.
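The spacing requirement can be checked directly from the product sizes given above. The following sketch compares the original and the new primer sets; the function names and the use of 12 bp (the upper end of the average deletion range quoted above) as a worst-case shift are assumptions made for illustration, and individual tumors may show larger deletions.

```python
# Normal PCR product sizes (bp) quoted in this appendix.
ORIGINAL_SIZES_BP = {"BAT-26": 121, "BAT-25": 124, "NR-21": 104, "NR-22": 143, "NR-24": 134}
NEW_SIZES_BP = {"BAT-26": 183, "BAT-25": 153, "NR-24": 131, "NR-21": 109, "NR-27": 87}

def min_gap_bp(sizes):
    """Smallest size difference between any two neighbouring products."""
    ordered = sorted(sizes.values())
    return min(b - a for a, b in zip(ordered, ordered[1:]))

def can_overlap(sizes, max_deletion_bp):
    """True if a deletion of max_deletion_bp in one product could reach its smaller neighbour."""
    return min_gap_bp(sizes) <= max_deletion_bp

for name, sizes in (("original", ORIGINAL_SIZES_BP), ("new", NEW_SIZES_BP)):
    print(name, "min gap:", min_gap_bp(sizes), "bp;",
          "overlap possible with a 12 bp deletion:", can_overlap(sizes, 12))
```

With the original primers the closest pair of products (BAT-26 and BAT-25) differs by only 3 bp, whereas the new primers leave at least 22 bp between neighbouring products.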