Long-Read Nanopore Sequencing Identifies Mismatch Repair-Deficient Related Genes with Alternative Splicing in Colorectal Cancer

Background Alternative splicing (AS) plays a crucial role in regulating the progression of colorectal cancer (CRC), but its distribution remains to be explored. Here, we aim to investigate the genes edited by AS which show differential expression in patients with mismatch repair deficiency (dMMR)/microsatellite instability (MSI). Materials and Methods We applied long-read nanopore sequencing to determine the mRNA profiles and screen AS genes using Oxford Nanopore Technologies (ONT) method in ten paired CRC tissues. CRC tissue and plasma samples were used to validate the differential genes with AS using real-time fluorescent quantitative PCR, immunohistochemistry, and enzyme-linked immunosorbent assay. Results ONT sequencing identified 404 genes were downregulated, and 348 genes were upregulated in MSI cancer tissues compared with microsatellite stability (MSS) cancer tissues. In total, 6,200 AS events were identified in 2,728 mRNA transcripts. WGCNA revealed dMMR/MSI-correlated gene modules, including INHBA and RPL22L1, which were upregulated; conversely, HMGCS2 was downregulated in MSI cancer. Overexpression of RPL22L1, INHBA, and CAPZA1 was further confirmed in CRC tissues. INHBA was found to be associated with tumor lymphatic metastasis. Importantly, the levels of INHBA in CRC plasma were significantly increased compared with those in noncancer plasma. INHBA showed a higher level in dMMR/MSI CRC than in MSS CRC, indicating that INHBA is a useful biomarker. Conclusion Our results showed that ONT-identified genes provide a pool to explore AS-associated markers for dMMR/MSI CRC. We demonstrated INHBA as a promising signature for clinical application in predicting tumor lymphatic metastasis and screening dMMR/MSI candidates.


Introduction
Colorectal cancer (CRC) is one of the most important malignant gastrointestinal tumors and has demonstrated high rates of mortality and morbidity [1]. CRC is classified as mismatch repair deficient (dMMR) or microsatellite instability (MSI) by the absence of at least one of four mismatch repair proteins (MLH1, MSH2, MSH6, or PMS2) [2]. The characteristics of MSI CRC include a high tumor mutational burden, generating hyperneoantigens that are responsible for favorable clinical responses to immune checkpoint blockade (ICB) [3]. However, approximately 10~15% of CRC cases display dMMR/ MSI [4].
Currently, dMMR/MSI screening is helpful to select the most appropriate ICB for CRC and is also used as a biomarker to predict the therapeutic response to chemotherapy [5]. Patients with metastatic CRC (mCRC) that are high in dMMR/MSI are less responsive to conventional chemotherapy with or without drugs targeting VEGF or EGFR [6]. The KEYNOTE-177 study showed that pembrolizumab (anti-PD1) monotherapy provides a clinically meaningful improvement in progression-free survival (PFS) compared with chemotherapy for dMMR/MSI mCRC (median PFS: 16.5 months vs. 8.2 months) [7]. Additionally, mCRC patients with dMMR/MSI treated with first-line pembrolizumab showed better health-related quality of life than those treated with chemotherapy [2].
Cancer-associated mutations in genes encoding RNA alternative splicing (AS), such as MLH1, commonly occur in CRC [8]. AS is a rich source of tumor-specific neoantigen targets for ICB [9]. To some extent, MSI patients can benefit from ICB depending on large numbers of immunogenic neoantigens that can be recognized by immune cells [10]. Mutations in MMR proteins confer an increased lifetime risk of cancer development in affected individuals, while neoantigens result from the hypermutable nature of dMMR [11]. AS is partly responsible for transcript variation and proteome diversity [2]. MLH1 deficiency triggers chromosomal abnormalities and the release of nuclear DNA into the cytoplasm, resulting in the activation of the cyclic GMP-AMP synthase-(cGAS-) STING pathway, whose key molecules undergo AS [12]. cGAS acts as a specific pattern recognition receptor and is activated by type I interferons (IFNs), which mediate the Janus kinase-(JAK-) signal transducer and activator of transcription (STAT) pathway [13]. Through AS, STAT3 is translated into two distinct isoforms, STAT3α and STAT3β [14]. Therefore, identification dMMR-related AS genes would be helpful on either illustrating novel mechanism for ICB or screening biomarker for predicting therapeutic effect.
Comparing with microsatellite stability (MSS) CRCs, dMMR CRCs exhibited distinct immunological feature. Activation of cGAS-STING and type I IFN signaling promotes the overexpression of CCL5 and CXCL10 and recruitment CD8 + T cells into dMMR CRC TME [15]. CXCL10 is upregulated in homologous recombination-deficient tumors and binds to CXCR3 [16]. CXCR3 is primarily expressed on activated CD8 + T cells and is suppressed by transforming growth factor beta (TGFβ) [17]. TGFβ signaling is the major pathway that excludes immune cells from tumors. MSI CRC is unresponsive to TGFβ because of TGFBR2 mutations [18]. Thus, different molecular patterns exist between MSI and MSS patients, but their identities remain elusive.
High-throughput sequencing technology has become increasingly available to identify differential mRNA profiles or genomic variations. Nanopore sequencing technology with highly accurate quantification of transcripts is a thirdgeneration sequencing method recently developed to sequence full-length RNA transcripts as well as distinct AS transcripts [19]. In the present study, we performed transcriptome sequencing to survey the full-length CRC transcriptome landscape offered by Oxford Nanopore Technologies (ONT). We further compared the differential profiles between MSI and MSS tumors. Our data provide the first full-length MSI and MSS CRC transcriptome databases, allowing us to identify INHBA as biomarkers to discern among MSI CRC patients using blood plasma.

Patient Selection.
For ONT RNA sequencing, 20 fresh samples were collected from 10 CRC patients from 2020 to 2021 at Beijing Chao-Yang Hospital. These CRC patients enrolled five dMMR/MSI patients. dMMR/MSI was defined when any of these MMR proteins were completely absent in the cancer tissue but presented in adjacent benign tissue. Microsatellite stability (MSS) patients were defined when all of MMR (MLH1, MSH2, MSH6, and PMS2) were positive expression in both tumor and adjacent benign tissue. Immunohistochemistry (IHC) was assayed to detect MLH1, MSH2, MSH6, and PMS2 separately. Patients who showed dMMR were defined as MSI. For validation, 23 paired tumor samples and tumoradjacent normal fresh tissues were collected and stored at -80°C until to use. Forty-six normal intestinal tissues and 49 cancer samples were formalin-fixed and formalin-embedded and reassessed by two pathologists. Sixty-six peripheral plasma samples from CRC patients and 20 plasma samples from nontumor donors were separated and stored at -80°C until use. This study was approved by the Institutional Review Board at Beijing Chao-Yang Hospital. All the enrolled subjects signed written informed consent forms.
2.2. ONT Sequencing. One microgram of total RNA was prepared for cDNA libraries using a cDNA-PCR Sequencing Kit (SQK-PCS109; ONT). The products were then subjected to ONT adaptor ligation using T4 DNA ligase (NEB). The final cDNA libraries were added to FLO-MIN109 flow cells and run on the PromethION platform at Biomarker Technology Company (Beijing, China). Raw reads were first filtered (minimum average read quality score = 7 minimum read length = 500 bp). Ribosomal RNAs were discarded after mapping to the rRNA database. Clusters of full-length, nonchimeric (FLNC) transcripts and consensus isoforms were obtained after mapping to the reference genome (GRCh38_release95) with minimap2 (version 2.7-r654) in spliced alignment mode after polishing within each cluster using pinfish [20]. Mapped reads were further collapsed by the cDNA_Cup cake package with min-coverage = 85%and min-identity = 90%. The criteria for fusion candidates are that a single transcript must map to ≥2 loci (minimum coverage 5%, ≥1 bp; total coverage ≥ 95%, at least 10 kb). The transcripts were validated against known reference transcript annotations (hg38) with Gffcompare (version 0.1.26) [21]. Alternative splicing (AS) events, including IR (intro retention), ES (exon skipping), AA (Alternative acceptor), and MEE (mutually exclusive exon), were identified using the AStalavista tool (version 3.0) [22]. Simple Sequence Repeat (SSR) of the transcriptome were identified using MIcroSAtellite identification tool (version 2.1) [23]. Alternative polyadenylation (APA) analysis was conducted with TAPIS (Transcriptome Analysis Pipeline for Isoform Sequencing, version 1.2.1) [24]. Coding Sequence (CDS) were predicted by TransDecoder (version 5.5.0) [25]. The data were submitted to the NCBI SRA database and are accessible through the citation accession number PRJNA758110. For quantification, the expression levels were estimated by reads per gene/transcript per 10,000 reads mapped. Differential expression analysis was performed using the DESeq2 R package (version 1.6.3). These bioinformatic 2 Disease Markers analyses were performed by Biomarker Technology Company. Genes with an adjusted P value < 0.01 and fold change ðFCÞ ≤ −1:5 or ≥1.5 found by DESeq2 were assigned as differentially expressed. Gene Ontology (GO) enrichment analysis of the differentially expressed genes (DEGs) was implemented using the DAVID Bioinformatics Resources (version 6.8) tool [26] and plotted using GOplot R packages (version 1.0.2) [27]. GO terms included biological process (BP), cellular component (CC), and molecular function (MF). The top eight terms of biological process (BP) were statistically analyzed according to P value (<0.05) and FC ≤ −1:5 or ≥1.5. .org/). The R package "limma" [28] was used to screen for the differentially expressed genes between cancer and normal samples according to the criteria of an adjusted P value < 0.01 and fold change ðFCÞ ≥ ≤ − 1:5 or ≥1.5.

Estimating the Proportion of Immune and Cancer (EPIC)
and CIBERSORT Analysis. EPIC and CIBERSORT were used to estimate the fraction of immune cells of sequenced samples. EPIC was performed using an online tool (http://epic .gfellerlab.org/). The R package "CIBERSORT" (R script version 1.03) was applied to assess the fractions of the 22 types of tumor-infiltrating immune cells [29]. Outputs were considered to be accurate after screening to meet P value < 0.05.

2.7.
Enzyme-Linked Immunosorbent Assay (ELISA). The amount of INHBA in blood plasma was quantified using the Human INHBA ELISA Kit (#ml022686; mlbio, China) according to the manufacturer's instructions. Briefly, 50 μl of plasma was added to 96-well anti-INHBA-coated plates and incubated at 4°C overnight. After that, the samples were removed and washed with washing buffer, and then horseradish peroxidase-labeled antibodies were added and incubated at 37°C for 1 hour. Finally, substrate solution was added and terminated by adding stop solution. Each well was measured at an Optical Density (O.D.) of 450 nm using a microtiter plate reader within 15 minutes. The INHBA concentrations were calculated using a standard curve.
2.8. Statistical Analysis. We used unpaired Student's t-test and the Mann-Whitney test as indicated, and binary outcomes were compared. A P value less than 0.05 was considered statistically significant. Analysis was performed using GraphPad Prism software (version 8.0.1).

Long Read RNA Sequencing Establishes CRC-Related
Differential Profiles. To investigate the differential mRNA profiles of CRC, we used full-length RNA sequencing by ONT, which is a third-generation RNA-seq technique, and constructed transcriptome landscapes in 20 CRC samples, which were paired normal and cancer tissues ( Figure S1, Figure S2, Supplementary Table S1). ONT sequencing identified 8,566 transcripts with fold changes ≤ −1:5 or ≥1.5 and a P value < 0.01, encoding 4,179 genes ( Figure 1(a)). The downregulated genes numbered 2,632, and the upregulated genes numbered 1,547 in cancer tissues. Gene set enrichment analysis (GSEA) indicated that these genes are involved in the primary immunodeficiency, regulation of actin cytoskeleton, cellular response to stimulus, and focal adhesion (Figure 1(b)). Additionally, pathway enrichment analysis demonstrated that these genes were significantly enriched in DNA replication and cell cycle pathways (Figure 1(c)). To explore the biological function, Gene Ontology (GO) enrichment analysis was performed, and the eight top GO terms associated with    To validate our findings, we identified differential mRNAs in TCGA colorectal cancer datasets (COAD and READ) ( Figure S3A). In total, 5,121 and 4,924 differentially expressed genes were identified in both datasets. Additionally, 1,417 overlapping genes showed differential expression by either ONT sequencing or TCGA analysis ( Figure S3B). GO analysis was also used to enrich the biological processes for COAD differential genes, indicating that these genes were involved in GO terms concerning the chemokine-mediated signaling pathway, cellular response to tumor necrosis factor, and cellular response to interleukin-1 ( Figure S3C). For READ differential genes, GO terms concerning the chemokinemediated signaling pathway, cell chemotaxis, and chemotaxis were enriched as the most important biological processes for these genes involved ( Figure S3C). The above results showed that chemokines play a key role in regulating CRC progression.

dMMR/MSI Related Genes.
We further addressed the differentially expressed genes between MSI and MSS to screen the biomarkers associated with dMMR. ONT sequencing     Disease Markers analyzed five MSIs and five MSSs. In cancer tissues, 752 differentially expressed genes were identified; 404 genes were downregulated, and 348 genes were upregulated in MSI cancer tissues compared with MSS cancer tissues (Figure 2(a)). In normal tissues, 249 differential genes were identified in MSI, comprising 126 upregulated genes and 123 downregulated genes (Figure 2(a)). The MLH1 gene is a hallmark of dMMR and was significantly decreased in MSI cancer tissues in our ONT analysis (Figure 2(b)). Two hundred thirty-six overlapping genes with differential expression were identified when comparing cancer-MSI versus (vs.) cancer-MSS or cancer vs. cancer tissue (Figure 2(c)). Additionally, 107 overlapping genes were detected when comparing normal-MSI vs. normal-MSS or cancer vs. normal tissues (Figure 2(c)).
To assess the biological features of these differentially expressed genes, we employed GO analysis to feature the terms for MSI-related genes. In cancer tissues, the most significant GO terms for MSI-specific genes included GO:0006695 (cholesterol biosynthetic), GO:0008299 (isoprenoid biosynthetic), and GO:0098609 (cell-cell adhesion), which referred to biological process, molecular functions, and cellular component, respectively (Figure 2(d)). In normal tissues, the most significant GO terms were GO:0005886 (plasma membrane) and GO:0045892 (negative regulation of transcription, DNA-templated), which were classified as cellular component and biological process, respectively (Figure 2(f)). Additionally, both cancer and normal tissues MSI-related differential genes showed similar GO terms, and most of these genes were involved in metabolic-related processes, such as cholesterol biosynthetic and peptide catabolic process (Figures 2(e) and 2(g)).

Exploration of MSI-Related Alternative Splice (AS) Events.
To identify AS landscape, these transcripts determined using ONT sequencing were aligned to against known reference transcript annotations (hg38, UCSC) [30]. This analysis allows identification of the following 7 types of alternative splice events: (1) alternative 3 ′ splice site, (2) alternative 5 ′ splice site, (3) exon skipping, (4) exon skipping/mutually exclusive exon, (5) intron retention, (6) mutually exclusive exon, and (7) mutually exclusive exon/exon skipping. In total, ONT sequencing detected 6,200 AS events in 20 samples, which were found in 2,728 genes. The frequencies of AS were significantly higher in cancer tissues than in normal tissues, indicating that ASrelated variations play critical roles in regulating tumor progression (Figure 3(a)). Additionally, AS frequencies were significantly increased in MSI compared with MSS and revealed that AS might be the signature for dMMR (Figure 3(a)). Exon skipping was the most important AS event, accounting for nearly 60% of the total AS, and was significantly increased in cancer tissues (Figure 3(b)). However, alternative 5 ′ splice sites were significantly decreased in cancer tissues (Figure 3(b)). In total, 932 genes with AS events were significantly differentially    (Figure 3(c)). To screen AS events specifically expressed in MSI, we selected AS events that occurred in three or more of these 20 samples, as a result, 170 MSI-related AS transcripts were produced which were transcribed by 144 genes (Figure 3(d)). Among 144 genes, 68 genes were differentially expressed when comparing their levels between cancer and normal tissues (Figure 3(e)). Eleven ASedited genes had differential levels when compared their expression in MSI with their levels in MSS (Figures 3(e) and 3(f)). PPP1R14D, NOX1, and HMGCS2 were downregulated, while POSTN and TIMP1 were upregulated in MSI tissues (Figure 3(f)). According to UCSC analysis [31], both POSTN and TIMP1 had six transcripts and were transcribed through exon skipping, and NOX1 was encoded by four transcripts while transcribed by exon skipping. HMGCS2 and PPP1R14D were edited by exon skipping to transcribe two types of transcripts (Figure 3(g)).

Infiltrated Immune Cells in Tumor Microenvironment
Were Identified by EPIC and CIBERSORT Analysis. Estimating the proportion of immune and cancer (EPIC) and CIBERSORT are designed to identify infiltrated immune cells and cancer cells from bulk RNA sequencing. Both tools are used to assess the proportion of immune cells and cancer cells in ONT sequencing data. EPIC identified 8 types of cells in 20 samples ( Figure S4A), and immune cells, including CD8 T cells, NK cells, and B cells, were higher in normal tissues than in cancer tissues ( Figure S4B). By contrast, endothelial cells and macrophages were decreased in cancer tissues ( Figure S4B). Cancer-associated fibroblasts (CAFs) and NK cells were significantly increased in MSI ( Figure S4C). CIBERSORT recognized 22 cell types, which included several subtypes of immune cells. Plasma cells accounted for the highest proportion in our ONT data ( Figure S4D), and these cells were decreased in cancer tissues ( Figure S4E). Additionally, activated mast cells, activated/resting dendritic cells, activated NK cells, and naïve B cells were increased, while M2 macrophages and resting CD4 memory cells were decreased in cancer tissues ( Figure S4E). In CIBERSORT analysis, activated mast cells were abundant in MSI cancer ( Figure S4F). EPIC and CIBERSORT analysis indicated significantly differential immune cell patterns between cancer and normal tissues.

WGCNA Identifies MSI-Specific Clusters.
Weighted gene coexpression network analysis (WGCNA) was performed across all the sequencing samples to address the specific clusters associated with MSI. In total, these genes can be divided into 17 clusters, which are indicated with different colors in Figure 4(a). Three clusters showed differential expression in MSI that might represent signatures. Cluster 1 included 246 genes; among them, 20 overlapped genes were simultaneously shown to be differentially expressed in MSI and cancer vs. normal tissues (Figure 4(b)). INHBA, MMP1, and SERPINE1 comprised a protein-protein interacting (PPI) network according to STRING network analysis, and these genes were significantly increased in cancer tissues or MSI cancer tissues (Figure 4(b)). In cluster 2, HMGCS2, FABP1, and SELENBP1 were significantly decreased in MSI according ONT sequencing result, consistent with an interacting network (Figure 4(c)). Cluster 3 included several upregulated genes in MSI and failed to enrich any PPI network (Figure 4(d)). We further addressed the overall survival (OS) relevant to the TCGA dataset using the GEPIA tool. The results demonstrated that the overexpression of INHBA, TNFAIP6, and TIMP1 was associated with a lower OS rate (Figure 4(e)). By contrast, increases in HMGCS2 and SELENBP1 were associated with a higher OS rate (Figure 4(e)).   Disease Markers tissue and showed lower levels in MSI than in MSS. Additionally, HMGCS2 is an AS transcript that is not expressed in MSI and serves as a signature for MSI. Importantly, RPL22L1, INHBA, and HMGCS2 were involved in the I-SMAD binding biological process (Figure 2(e)). Therefore, we selected these four genes to further explore their differential expression using RT-qPCR in 23 paired cancer and normal tissues. As expected, the mRNA levels of RPL22L1, INHBA, and CAPZA1 were significantly higher in cancer than in normal tissues; conversely, HMGCS2 was significantly downregulated in cancer ( Figure 5(a)). Next, we questioned whether these genes could be signatures for MSI. We compared their fold changes between MSI and MSS, and INHBA showed significantly increased levels in MSI ( Figure 5(b)). In cancer tissues, RPL22L1 and INHBA were significantly upregulated in CRC with positive lymph nodes. In normal tissues, RPL22L1,

Disease Markers
INHBA, and CAPZA1 were found to be significantly increased in CRC patients for whom experienced tumor lymphatic metastasis ( Figure 5(c)).
Next, we applied IHC to determine the protein levels of INHBA, RPL22L1, and CAPZA1. Forty-nine cancer tissues and 47 normal tissues were enrolled for IHC analysis. Their

12
Disease Markers levels were scored using semiquantitative analysis, demonstrating that these three genes were significantly overexpressed in cancer tissues (Figures 6(a)-6(d)). IHC assays were consistent with our RT-qPCR analyses. We also addressed clinicopathological relevance and found that INHBA was significantly increased in patients diagnosed with lymph node metastasis tumors (Figure 6(e)). INHBA shown higher levels in male CRC patients than their levels in female (Table S2). Other clinicopathological factors included tumor size, differentiation, TNM stage, and vascular and nerve invasion; INHBA levels did not shown statistic differences (Tables S2). We inferred based on both mRNA and protein that INHBA expression can be used as a biomarker to predict tumor lymphatic metastasis. Seven patients who were enrolled in the IHC analysis were diagnosed with MSI. We compared the expression of INHBA in these patients and found significantly higher levels in patients with MSI than in MSS patients ( Figure 6(f)).

INHBA Is a Signature for MSI Diagnosis in Blood Plasma.
Because INHBA is a protein secreted into the extracellular space by tumor cells, we applied ELISA to determine its plasma levels. Our analysis enrolled 20 plasma samples from patients diagnosed with nontumor benign disease and 66 plasma samples from CRC patients. As expected, the CRC plasma levels of INHBA were significantly increased compared with those of samples from nontumor benign disease patients (Figure 7(a)). Receiver operating characteristic (ROC) curve analysis was performed to examine the diagnostic efficiency when INHBA was recognized as a biomarker. Area under the curve (AUC) analysis indicated that secreted INHBA was a

Discussion
In the present study, we constructed CRC mRNA profiles using ONT long read sequencing, which has a unique advantage in identifying the AS of mRNA transcripts. dMMR/MSI tumors exhibit a high frequency of AS, which is associated with a large burden of destabilizing mutations [32]. Utilizing integrative bioinformatic analysis, we demonstrated that immune chemotactic genes are the most differentially expressed in cancer tissues and mediate the recruitment of immune cells [33]. Both B and T cells are the major class of immune cells in the adaptive immune system and have multiple functions in facilitating the coevolution of cancer and its microenvironment [34]. Our ONT data consistently indicated that these cells are decreased in cancer tissues. Innate immune cells, including NK cells, macrophages, and mast cells, were dysregulated and showed differential levels in MSI-related CRC. These findings are associated with the characteristics of innate cells, which infiltrate the microenvironment and display tumor phagocytosis or promote tumor growth [35].
Despite extensive investigation of genomic mutations and many genetic drivers, few studies have focused on AS in dMMR/MSI [36]. Here, despite using a small cohort, our ONT sequencing determined thousands of differentially expressed genes and AS events, which showed a higher efficacy than next-generation sequencing. Tumor mutational burdens predict the response efficiency to ICB treatment in dMMR/MSI tumors [37]. In our study, AS of mRNA transcripts was significantly correlated with dMMR/MSI and we screened several potential biomarkers to validate in CRC tissues or plasma. INHBA, RPL22L1, CAPZA1, and HMGCS2 showed significant differential levels in dMMR/ MSI cancer tissues and were finally selected for further detection. RPL22L1, a ribosomal protein, is overexpressed in ovarian cancer and promotes cancer cell metastasis via the epithelial-to-mesenchymal transition [38]. Consistently, we validated its overexpression in CRC and found that it was associated with dMMR/MSI or tumor lymphatic metastasis. CAPZA1 regulates actin polymerization, which controls cell motility by binding to the barbed ends of actin filaments [39]. CAPZA1 is upregulated in hepatocellular carcinoma [40] and gastric cancer [39]. HMGCS2 is downregulated in CRC, serving as a critical downstream target of SLC38A4 to regulate tumor cell metabolism [41].
INHBA is a susceptibility gene for pancreatic cancer [42] that is involved in the TGF-β signaling pathway and was edited by AS in our ONT analysis. INHBA is associated with the CRC OS rate and plays a role in promoting tumor cell proliferation and migration [43]. We demonstrated that INHBA is significantly increased in CRC cancer cells and, more importantly, is a potential biomarker for discerning dMMR/MSI candidates.
The most important concept conveyed in this study is a comprehensive investigation of mRNA profiles using thirdgeneration sequencing, which allows us to explore AS events related to CRC dMMR/MSI for the first time. These newly constructed ONT data identified multiple genes that could be applied as markers associated with CRC progression. We surveyed their potential clinical applications in small sample sizes, such as INHBA, a promising dMMR/MSI biomarker. Two major limitations of our study are its retrospective nature and single-center design. Further study is required to address the clinical relevance of INHBA or other identified genes as markers for MSI-H/dMMR solid tumors in a larger cohort.

Data Availability
The data were submitted to the NCBI SRA database and are accessible through the citation accession number PRJNA758110.

Ethical Approval
The present study was approved by institutional review board at Beijing Chao-Yang Hospital.

Consent
All the enrolled subjects signed written informed consent for the use of their clinical specimens.