Epidemiology of Microsatellite Instability High (MSI-H) and Deficient Mismatch Repair (dMMR) in Solid Tumors: A Structured Literature Review

Background. Given limited data on the epidemiology of MSI-H and dMMR across solid tumors (except colorectal cancer (CRC)), the current study was designed to estimate their prevalence. Materials and Methods. A structured literature review identified English language publications that used immunohistochemistry (IHC) or polymerase chain reaction (PCR) techniques. Publications were selected for all tumors except CRC using MEDLINE, EMBASE, and Cochrane databases and key congresses; CRC and pan-tumor genomic publications were selected through a targeted review. Meta-analysis was performed to estimate pooled prevalence of MSI-H/dMMR across all solid tumors and for selected tumor types. Where possible, prevalence within tumor types was estimated by disease stage. Results. Of 1,176 citations retrieved, 103 and 48 publications reported prevalence of MSI-H and dMMR, respectively. Five pan-tumor genomic studies supplemented the evidence base. Tumor types with at least 5 publications included gastric (n = 39), ovarian (n = 23), colorectal (n = 20), endometrial (n = 53), esophageal (n = 6), and renal cancer (n = 8). Overall MSI-H prevalence (with 95% CI) across 25 tumors was based on 90 papers (28,213 patients) and estimated at 14% (10%–19%). MSI-H prevalence among Stage 1/2 cancers was estimated at 15% (8%–23%); prevalence for Stages 3 and 4 was estimated at 9% (3%–17%) and 3% (1%–7%), respectively. Overall, dMMR prevalence across 13 tumor types (based on 54 papers and 20,383 patients) was estimated at 16% (11%–22%). Endometrial cancer had the highest pooled MSI-H and dMMR prevalence (26% and 25% all stages, respectively). Conclusions. This is the first comprehensive attempt to report pooled prevalence estimates of MSI-H/dMMR across solid tumors based on published data. Prevalence determined by IHC and PCR was generally comparable, with some variations by cancer type. Late-stage prevalence was lower than that in earlier stages.


Introduction
DNA mismatch repair (MMR) plays a key role in maintaining genomic stability by recognizing and repairing base-base mismatches and insertions/deletions of DNA generated during replication and recombination. Defects in MMR are associated with genome-wide instability and the progressive accumulation of mutations, especially in regions of simple repetitive DNA sequences known as microsatellites, resulting in microsatellite instability (MSI). MSI-high (MSI-H) is a hypermutable phenotype that allows mutations to accumulate rapidly, resulting in tumor development via the selection of cancer-promoting mutations in pathways that are responsible for maintaining functional DNA repair, apoptosis, and cell growth.
Polymerase chain reaction (PCR) and immunohistochemistry (IHC) are the widely accepted testing platforms for MSI-H and deficient mismatch repair (dMMR) status, respectively, in solid tumors. The PCR method uses a panel of microsatellite markers to detect size shifts in different loci. The IHC method uses a more direct test to determine the presence of MMR proteins. A tumor is typically classified as MSI-H if shifts are detected in at least 2 of 5 loci using the PCR method and as dMMR if at least one MMR protein is absent using the IHC method. The use of NCI (BAT-25, BAT-26, D2S123, D5S346, and D17S250) [1] and Promega (BAT-25, BAT-26, NR-21, NR-24, and MONO-27) [2] panels in PCR and the use of MLH1, MSH2, MSH6, and PMS2 proteins in IHC are considered the gold standard approaches [3,4,5].
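The classification rules just described can be illustrated with a minimal Python sketch. It is not taken from any of the reviewed studies: the function names, the input shapes, and the example call are assumptions made for illustration, and only the thresholds (at least 2 of 5 shifted loci; at least one absent MMR protein) come from the text.

```python
# Marker panels and proteins as named in the text.
NCI_PANEL = ("BAT-25", "BAT-26", "D2S123", "D5S346", "D17S250")
PROMEGA_PANEL = ("BAT-25", "BAT-26", "NR-21", "NR-24", "MONO-27")
MMR_PROTEINS = ("MLH1", "MSH2", "MSH6", "PMS2")


def is_msi_high(shifted_loci, panel=NCI_PANEL, min_shifted=2):
    """Hypothetical helper: MSI-H if at least `min_shifted` panel loci show a size shift."""
    unknown = set(shifted_loci) - set(panel)
    if unknown:
        raise ValueError(f"loci not in panel: {sorted(unknown)}")
    return len(set(shifted_loci)) >= min_shifted


def is_dmmr(protein_present):
    """Hypothetical helper: dMMR if at least one of the four MMR proteins is absent.

    `protein_present` maps each protein name to True (expressed) or False (absent).
    """
    return any(not protein_present[p] for p in MMR_PROTEINS)


# Illustrative input only, not data from the review.
example_ihc = {"MLH1": False, "MSH2": True, "MSH6": True, "PMS2": True}
print(is_msi_high(["BAT-25", "BAT-26"]), is_dmmr(example_ihc))
```

Individual laboratories may apply additional categories (for example MSI-L) or different cut-offs; the sketch covers only the two binary calls defined above.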
Among patients diagnosed with metastatic cancer and MSI-H or dMMR, prognosis is generally poor [6]. Recently, evidence has mounted for the benefits of immunotherapy, especially with checkpoint inhibitors such as pembrolizumab, in MSI-H/dMMR tumors [7,8,9]. Historically, most patients with a solid tumor diagnosis were not tested for MSI; a better understanding of MSI-H and dMMR prevalence can help estimate the size of the potential target population. To provide reliable estimates of MSI-H and dMMR prevalence, a comprehensive structured literature review was conducted to gather relevant and recent evidence on the epidemiology of MSI-H and dMMR across multiple tumors. When sufficient data were available, meta-analysis was performed to estimate the prevalence of MSI-H and dMMR tumors overall, across individual tumor types, and by stage of disease.

Methods
Study eligibility criteria outlined in Table 1 guided study identification and selection for the literature review.

Literature Review.
Relevant studies were identified by searching the following through the Ovid platform: Medical Literature Analysis and Retrieval System Online (MEDLINE), Excerpta Medica database (Embase), and Cochrane Central Register of Controlled Trials. Predefined search strategies were executed on October 26, 2017. Study design filters recommended by the Scottish Intercollegiate Guidelines Network (SIGN) were used. Population terms were adapted from published research [9]; no intervention or comparator terms were used. Systematic reviews, meta-analyses, and key narrative reviews of interest were identified via hand searching. Targeted hand searches were conducted to identify colorectal cancer (CRC) studies and pan-tumor genomic studies reporting MSI-H/dMMR prevalence. Studies for all solid tumors except CRC were selected through database searches; CRC and pan-tumor genomic studies were selected through a targeted review. One reviewer reviewed all abstracts and proceedings identified through database searches and the targeted review according to the selection criteria. Studies identified as potentially eligible during abstract screening were screened in full text by the same reviewer. The full-text studies identified at this stage were included for data extraction. The process of study identification and selection is summarized with Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) flow diagrams [10].
One reviewer extracted data on study characteristics, interventions, patient characteristics, and outcomes from included studies. A second reviewer independently extracted data from a random 10% of the publications, reconciled the data, and determined the error rate and missing data rate of data extraction by the first reviewer. The error rate (number of cells with incorrect data/number of cells with text) was 2.9%, and the missing data rate (number of cells with missing data/number of blank cells) was 1.2% (an error rate greater than 5% would have triggered extraction of a further 10% of publications by the second reviewer). All errors discovered through this process were corrected. Potential publication biases were checked through funnel plots. Data were stored and managed in a Microsoft Excel workbook.
Only studies that used PCR or IHC methods were included in this review. To increase validity of the meta-analysis, only studies that used NCI (BAT-25, BAT-26, D2S123, D5S346, and D17S250) or Promega (BAT-25, BAT-26, NR-21, NR-24, and MONO-27) panels in PCR and MLH1, MSH2, MSH6, and PMS2 proteins in IHC were included in the meta-analysis. The only exceptions were pan-tumor genomic studies, which used large-scale sequencing techniques to test for the presence of only the MLH1 gene. These genomic studies were included in sensitivity analyses to detect their potential effect on the meta-analysis.
Prevalence of MSI-H and/or dMMR was extracted overall, by tumor type, histology, stage, and country.

Meta-Analysis.
Reported proportions were transformed using the Freeman-Tukey variant of the arcsine square root (double arcsine) transformation [11]. The pooled proportion was calculated by back-transforming the weighted mean of the transformed proportions, using the DerSimonian-Laird random effects model [12].
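As a rough illustration of this pooling procedure, the Python sketch below applies the double arcsine transformation, estimates between-study variance with the DerSimonian-Laird moment estimator, and back-transforms the pooled value. It is not the authors' code. Two assumptions are worth flagging: the transformation is scaled by one half with sampling variance 1/(4n + 2), one common convention, and the back-transformation uses the simple sin² approximation rather than an exact inversion that depends on sample size. The study counts in the usage line are invented.

```python
import math


def freeman_tukey(events, n):
    """Double arcsine transform of events/n; returns (transformed value, variance)."""
    t = 0.5 * (math.asin(math.sqrt(events / (n + 1)))
               + math.asin(math.sqrt((events + 1) / (n + 1))))
    return t, 1.0 / (4 * n + 2)


def pool_dersimonian_laird(studies):
    """Pool (events, n) pairs; returns (pooled proportion, lower 95% CI, upper 95% CI)."""
    ts, vs = zip(*(freeman_tukey(x, n) for x, n in studies))
    w = [1.0 / v for v in vs]                      # inverse-variance (fixed-effect) weights
    fixed = sum(wi * ti for wi, ti in zip(w, ts)) / sum(w)
    q = sum(wi * (ti - fixed) ** 2 for wi, ti in zip(w, ts))   # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    # DerSimonian-Laird between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (len(studies) - 1)) / c) if c > 0 else 0.0
    w_re = [1.0 / (v + tau2) for v in vs]          # random-effects weights
    mean = sum(wi * ti for wi, ti in zip(w_re, ts)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    def back(t):
        # Approximate back-transformation (assumption): p = sin(t)^2, clipped to [0, pi/2].
        return math.sin(min(max(t, 0.0), math.pi / 2)) ** 2

    return back(mean), back(mean - 1.96 * se), back(mean + 1.96 * se)


# Invented counts for illustration only.
pooled, lower, upper = pool_dersimonian_laird([(5, 100), (30, 100), (15, 100)])
print(f"{pooled:.3f} ({lower:.3f}-{upper:.3f})")
```

Because the back-transformation here is approximate, results from this sketch may differ slightly from those of a dedicated meta-analysis package.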
Meta-analysis was conducted using the metafor package version 1.9-9 in R 3.4.0. Weighting of each tumor type was based on cancer-specific prevalence estimates provided by the GLOBOCAN 2012 database from the World Health Organization [13]. For rare tumor types, when data were unavailable on the GLOBOCAN database, other databases and peer-reviewed publications were referenced [14][15][16][17][18]. Each tumor type was assigned a weight based on its general prevalence; in cases where two or more studies were included for a given tumor type, weight was split proportionally between studies based on the sample size.
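The weight-splitting rule in the preceding paragraph can be sketched as follows. This is an illustrative reading of the description, not the authors' implementation: the function name, the tuple layout, and all numeric values are hypothetical, and the tumor-level weights are assumed to be supplied already (for example, derived from GLOBOCAN figures).

```python
from collections import defaultdict


def split_tumor_weights(tumor_weights, studies):
    """Split each tumor type's weight across its studies in proportion to sample size.

    tumor_weights: dict mapping tumor type -> weight for that tumor type.
    studies: list of (study_id, tumor_type, sample_size) tuples.
    Returns a dict mapping study_id -> study-level weight.
    """
    total_n = defaultdict(int)
    for _, tumor, n in studies:
        total_n[tumor] += n            # total patients per tumor type
    return {
        study_id: tumor_weights[tumor] * n / total_n[tumor]
        for study_id, tumor, n in studies
    }


# Hypothetical weights and sample sizes, for illustration only.
weights = split_tumor_weights(
    {"gastric": 0.6, "ovarian": 0.4},
    [("A", "gastric", 100), ("B", "gastric", 300), ("C", "ovarian", 50)],
)
print(weights)
```

Under this rule the study-level weights within a tumor type always sum to that tumor type's weight, so adding more studies of a commonly reported tumor does not increase its influence on the overall estimate.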

Results
The study selection process for identification of studies reporting MSI-H or dMMR prevalence in the structured literature review is outlined in Figure 1. Overall, 1,176 publications were assessed for eligibility; a total of 156 full-text publications were included based on the structured and targeted literature review.
Beyond the 6 main tumor types feasible for tumor-specific meta-analyses, 19 other tumor types were included in the meta-analysis of overall MSI-H prevalence. Overall, MSI-H prevalence differed considerably across tumor types. A low prevalence of 2% (95% CI, 0%-8%) was observed in Ewing sarcoma [19], while a much higher prevalence of 35% (95% CI, 15%-57%) was reported in sebaceous tumors [20]. Small bowel [21] and cervical tumors [22] had prevalence of 12% each, which was very close to the all-tumor estimate.
Figure 2. Prevalence estimates, 95% confidence intervals, and number of studies included in each analysis are shown.
Meta-analysis results obtained from the random effects model in all tumor types are presented as forest plots in the Supplementary information (Appendix Figures 1-26). Funnel plots obtained from each meta-analysis are also presented in the Supplementary information (Appendix Figures 27-44). The weighted prevalence of MSI-H without genomic studies was estimated to be 14% (95% CI, 10%-19%) across all tumor types and stages. The prevalence was 10% (95% CI, 7%-14%) when four of the five large pan-tumor genomic studies were included (one genomic study was excluded as it did not report the total number of patients or the number of patients with MSI-H). Overall weighted dMMR prevalence was estimated to be 16% (95% CI, 11%-22%) across all

Discussion
This structured literature review and meta-analysis investigated MSI-H and dMMR prevalence across tumor types and compared prevalence estimates by tumor type, tumor stage, and country subgroups. Analysis results estimated the prevalence of MSI-H across all tumor types as 14% (95% CI, 10%-19%). dMMR prevalence was comparable at 16% (95% CI, 11%-22%).
Pooled dMMR prevalence estimates by tumor type were similar to those for MSI-H. It has been suggested that, for Lynch syndrome testing, PCR testing may be less sensitive than IHC because mutations in MSH6 may present as MSI-L [23]. The results of this review, however, suggest that MSI-H PCR and dMMR IHC testing results are generally comparable. The United States had higher MSI-H prevalence than Korea and Japan, but this result is possibly biased due to the lack of weighting for country-specific tumor prevalence.
Subgroup analysis indicated that early-stage disease (stages 1 and 2) tended to have a higher MSI-H prevalence than later stages (stages 3 and 4). Numerous studies have established the value of MSI status as a prognostic factor [24][25][26]. Results of a meta-analysis including 7642 patients indicated that MSI (MSI-H + MSI-L) tumors corresponded with significantly improved prognosis compared to MSS CRCs (overall survival HR 0.65; 95% CI, 0.59-0.71) [27]. This may partially explain the lower MSI-H prevalence in the later stages of cancers.
Some tumor types had noticeably higher MSI-H prevalence than others. Endometrial tumors had MSI-H prevalence of 26% (95% CI, 23%-29%), whereas renal tumors had MSI-H prevalence of only 1% (95% CI, 0%-2%). This observation corroborates findings from recent genomic studies, which revealed that the frequency of MSI-H events is highly variable across tumor types [13,28]. One study noted that MSI-H prevalence was highest in Lynch syndrome-associated tumor types (endometrial, colon, gastric, and rectal) [13], which is well aligned with findings from the current study.
The identified evidence base included 156 articles reporting on the prevalence of MSI-H and/or dMMR published between 1999 and 2017. This review covers more cancer types than any previously published review. Of the three other known published meta-analyses that have quantified the prevalence of MSI-H for selected tumors, the first (including publications to 2007) reported an MSI-H prevalence of 12% (95% CI, 8%-17%) in ovarian tumors [29], the second (including publications to 2009) reported an MSI-H prevalence of 10% (95% CI, 6%-14%) in ovarian tumors [30], and the third (including studies published up to 2014) reported an MSI-H prevalence of 17% (95% CI, 15%-19%) in colorectal tumors [31]. The findings from the current meta-analysis suggest MSI-H prevalence of 11% (95% CI, 6%-18%) in ovarian cancer patients and 13% (95% CI, 10%-16%) in colorectal cancer patients, which are well aligned with findings from previous meta-analyses.
This large-scale meta-analysis of the prevalence of MSI-H and dMMR used rigorous methodology in selection of testing methods, subgroup analyses, and incorporation of pan-tumor genomic studies in sensitivity analyses. First, this meta-analysis of MSI-H and dMMR prevalence included the largest number of studies (156) to date. Second, weighting techniques were used to adjust for overall tumor prevalence in order to prevent oversampling of commonly reported tumor types. Third, only studies that utilized the "gold standard" MSI-H and dMMR testing methods were included in the meta-analysis, so the results from these studies were more comparable. Fourth, the subgroup analyses, which were stratified by factors such as tumor type, country, and disease stage, indicated which factors had potential association with prevalence. Fifth, the inclusion of pan-tumor genomic studies in the sensitivity analyses offered an alternative scenario and suggested that the testing method used in large-scale genomic studies (sequencing) is significantly different from the widely accepted methods (PCR and IHC) used in other included studies.
This meta-analysis has some limitations. First, the literature review for CRC was a targeted hand search; some potentially relevant publications may not have been identified. Studies were reviewed by a single researcher, but a quality check was performed to validate the dataset. An additional limitation was the heterogeneity of included study designs, which comprised case-control, cross-sectional, prospective cohort, and retrospective cohort studies. However, because data were scarce for most cancer types, studies with different designs were included to maximize the data sources. Symmetry was observed on most funnel plots, which suggests a lack of publication bias. To address heterogeneity in study designs included in the meta-analysis, data were analyzed using fixed- and random-effects models; however, this exploration did not provide evidence of any specific source of heterogeneity. Finally, given the lack of MSI/MMR publications on a few major cancer types, the "overall" prevalence estimate does not include all solid tumors.
Recent evidence [32,33] supporting the role of MSI-H and dMMR, and associated immunogenicity, as a mechanism for increased efficacy of PD-1/PD-L1 blockade in metastatic tumors with MSI-H or dMMR [8] demonstrates the importance of increasing understanding [34] of prevalence across tumor type, stage, histology, and ethnicity.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Supplementary Materials
Meta-analysis results obtained from the random effects model in all tumor types are presented as forest plots in the Supplementary information (Appendix Figures 1-26). Funnel plots obtained from each meta-analysis are also presented in the Supplementary information (Appendix Figures 27-44).