Neoantigens Derived from Recurrently Mutated Genes as Potential Immunotherapy Targets for Gastric Cancer

Neoantigens are optimal tumor-specific targets for T-cell-based immunotherapy, especially for patients with “undruggable” mutated driver genes. T-cell immunotherapy can be a “universal” treatment for patients who share both an HLA genotype and the same oncogenic mutations. To identify potential neoantigens for therapy in gastric cancer, 32 gastric cancer patients were enrolled in our study. Whole exome sequencing data from these patients were processed by TSNAD software to detect cancer somatic mutations and predict neoantigens. The somatic mutations differed widely between patients, suggesting high interpatient heterogeneity. C>A and C>T substitutions were common, suggesting active nucleotide excision repair. The number of predicted neoantigens was significantly higher in patients at stage T1a than in patients at T2 or T4b. Six genes (PIK3CA, FAT4, BRCA2, GNAQ, LRP1B, and PREX2) were found to be recurrently mutated driver genes in our study. In combination with high-frequency HLA alleles, several neoantigens derived from these six recurrently mutated genes were considered potential targets for further immunotherapy.


Introduction
Since the approval of trastuzumab as a treatment for HER2-positive breast cancer in 1998 by the FDA [1], tumor-associated antigens such as CD molecules, VEGF, and EGFR have been actively targeted for drug development by the pharmaceutical industry [2-4]. The side effects of therapies based on monoclonal antibodies are mild and tolerable. However, when these targets are coupled with antibody-drug conjugates (ADC) or chimeric antigen receptor T-cell (CAR-T) technology, the nonspecific and durable off-target cytotoxicity can be fatal for patients [5]. Therefore, the development of an optimal tumor-specific target that can differentiate tumor cells from normal tissues is essential.
Several studies have shown that targeting neoantigens in T-cell-based immunotherapy is a promising approach for the treatment of lung adenocarcinomas [6], leukemia [7], and melanoma [8,9]. Cancer is initiated by somatic driver mutations and other genetic instabilities, which are the molecular basis of the carcinogenesis process. In particular, point mutations are directly involved in essential cellular activities and functions, such as proliferation, apoptosis, and tumorigenesis. Mutant proteins are also processed by the intracellular repair system through ubiquitination and hydrolysis in the proteasome. Hydrolyzed peptides (8-11 amino acids in length) are bound to class I major histocompatibility complex (MHC) molecules and are presented on the cell surface as tumor-specific neoantigens, which are recognized by T-cells, provoking an immune response.
Gastric cancer (GC) is the third leading cause of cancer mortality in the world. It is a common cancer prevalent in Eastern Asia, Central and Eastern Europe, and South America. The prognosis remains poor, with a 5-year overall survival rate of 30.4% [10,11]. Besides traditional chemotherapy agents, only trastuzumab, ramucirumab, and apatinib have been approved for advanced or metastatic GC. Systematic molecular profiling of GC in 595 patients by the Cancer Genome Atlas (TCGA) [12] and the Asian Cancer Research Group (ACRG) [13] shows that GC is highly heterogeneous, exhibiting high chromosomal instability, hypermethylation, and mutation burden. Based on these molecular characteristics, the identification of neoantigens against recurrently mutated oncogenes is feasible using current next-generation sequencing (NGS) platforms and bioinformatic analysis pipelines.
Previous studies have used genomic data from the TCGA, the Foundation Medicine Adult Cancer Clinical Dataset (FM-AD), and their own cohorts to characterize neoantigens and their association with genetic alteration or with survival [14-17]. However, these studies did not focus on neoantigen profiling for gastric cancer patients. We analyzed the characteristics of somatic mutations and neoantigens, especially their correlation with the clinical features of patients. The important neoantigens and their associated oncogenes shared by several patients were chosen with the goal of further developing T-cell-based immunotherapy, such as vaccines, for patients. In this work, we collected tumor tissues and peripheral blood samples from 32 gastric cancer patients. Whole exome sequencing was performed on the Illumina HiSeq4000 sequencing system. An integrated software package developed in-house, "Tumor-Specific Neo-Antigen Detector" (TSNAD) [18], was used to predict neoantigens.

Materials and Methods
Patients. Fresh or FFPE primary tumor tissues and paired peripheral blood were collected from 32 gastric cancer patients during the period from August 12, 2016, to March 14, 2017. Among the 32 gastric cancer patients, 11 were female and 4 were below 45 years of age. Of these, 2 were T1a, 6 were T2, 6 were T4a, and 18 were T4b cases. Detailed information on these samples is listed in Table 1. Human subjects were enrolled in this study after informed consent forms were signed. Written consent for the collection and use of tissues for research purposes was obtained, with ethical approval from the Research Ethics Committee of the First Affiliated Hospital, Zhejiang University School of Medicine, China. All methods reported in our study were performed in accordance with the relevant guidelines and regulations.
Whole Exome Sequencing. DNA was extracted from the tumor tissues and peripheral blood using the AxyPrep Blood Genomic DNA Kit and the AxyPrep Multisource Genome DNA Kit. Exomes were captured from 750 ng of genomic DNA per sample using the Agilent SureSelect Human All Exon V5 Kit (Agilent Technologies) according to the manufacturer's instructions. Paired-end multiplex sequencing was then performed on the Illumina HiSeq4000 sequencing platform. The mean sequencing depth was 86× per sample (standard deviation 46×).
Pipeline for Somatic Mutation Analysis, HLA Genotyping, and Neoantigen Prediction. The raw data were processed by the integrated software TSNAD (available at http://github.com/jiujiezz/TSNAD) [18]. This software, developed by our laboratory, provides a graphical user interface and combines the algorithms needed to identify cancer somatic mutations, determine HLA genotypes, and predict neoantigens. TSNAD identifies cancer somatic mutations from the genome/exome sequencing data of tumor-normal pairs following the best practices of the Genome Analysis Toolkit (GATK) and determines HLA genotypes with SOAP-HLA [19]. TSNAD then invokes NetMHCpan [20] to predict neoantigens that can bind to class I MHC molecules. TSNAD can also identify germline mutations.
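The step between mutation calling and binding prediction, in which candidate peptides spanning each missense change are enumerated, can be sketched as follows. This is a minimal illustration and not TSNAD's actual implementation: the function name `mutant_peptides`, the toy protein sequence, and the example substitution are hypothetical. Only the peptide length range of 8-11 residues is taken from the text.

```python
def mutant_peptides(protein, pos, alt, lengths=range(8, 12)):
    """Return every peptide of the given lengths that spans a missense change.

    protein: wild-type amino-acid sequence (string)
    pos:     1-based position of the substituted residue
    alt:     mutant amino acid (single letter)
    """
    if not 1 <= pos <= len(protein):
        raise ValueError("position outside protein")
    i = pos - 1  # 0-based index of the mutated residue
    mutant = protein[:i] + alt + protein[i + 1:]
    peptides = []
    for k in lengths:
        # Window starts that keep the mutated residue inside the k-mer.
        first = max(0, i - k + 1)
        last = min(i, len(mutant) - k)
        for start in range(first, last + 1):
            peptides.append(mutant[start:start + k])
    return peptides


# Toy 20-residue sequence (hypothetical), T -> S at position 10.
toy = "MACDEFGHKTLNPQRSVWYA"
candidates = mutant_peptides(toy, 10, "S")
```

Each candidate peptide would then be scored against the patient's own HLA class I alleles by a binding predictor. That scoring step is deliberately left out here because it depends on the external tool.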
Statistical Analysis. Statistical analyses were performed using R software, and the Wilcoxon rank-sum test was used to assess significance. Differences were considered significant at p < 0.05.
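The rank-sum comparison named above can be sketched as follows. The study itself used R; this Python version is only an illustration that enumerates the exact null distribution for small groups without ties. The function name `rank_sum_exact` and the example values are hypothetical and are not taken from the cohort.

```python
from itertools import combinations


def rank_sum_exact(x, y):
    """Two-sided exact Wilcoxon rank-sum p-value (assumes no tied values)."""
    pooled = sorted(x + y)
    rank = {v: r for r, v in enumerate(pooled, start=1)}
    n, total = len(x), len(pooled)
    observed = sum(rank[v] for v in x)
    expected = n * (total + 1) / 2  # mean of the rank sum under the null
    # Enumerate every way of assigning n of the ranks to the first group.
    sums = [sum(c) for c in combinations(range(1, total + 1), n)]
    extreme = sum(abs(s - expected) >= abs(observed - expected) for s in sums)
    return extreme / len(sums)


# Hypothetical neoantigen counts for two small groups.
p_value = rank_sum_exact([410, 520, 630], [120, 180, 240])
```

Full enumeration is practical only for small groups, because the number of rank assignments grows combinatorially with sample size.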

Results and Discussion
Interpatient Heterogeneity of Somatic Mutations and Neoantigen Number. Somatic mutation analysis and neoantigen prediction for the 32 gastric cancer patients were performed by TSNAD directly from the raw whole exome sequencing data. The numbers of nonsynonymous mutations, indels, and neoantigens are listed in Table 1. In total, 7,432 somatic missense mutations, 658 indels, and 12,929 neoantigens were identified by the software for the 32 patients. The median number was 138 for missense mutations (median tumor mutation burden, TMB = 4.6 mutations/Mb), 14 for indels, and 202 for neoantigens (see Figures 1(a) and 1(c)). The fitted curve revealed a positive linear correlation between the number of missense mutations and the number of predicted neoantigens (R^2 = 0.8845, see Figure 1(b)). According to three gastric cancer projects in the ICGC Project (GACA-CN, GACA-JP, and STAD-US), the TMB of Chinese gastric cancer patients (median TMB = 8.467 mutations/Mb) is greater than that of Japanese (median TMB = 6.467 mutations/Mb) and American patients (median TMB = 5.3 mutations/Mb). The difference in tumor mutation burden between our cohort and GACA-CN may reflect the high heterogeneity of gastric cancer and the small sample size of our cohort.
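The two quantities reported above, TMB and the coefficient of determination of the fitted line, can be sketched as follows. The exome size of 30 Mb is an assumption inferred from the reported figures (138 mutations at 4.6 mutations/Mb); the text does not state the denominator. The function names and the example counts are hypothetical.

```python
EXOME_MB = 30.0  # assumption: implied by 138 mutations ~ 4.6 mutations/Mb


def tmb(missense_count, exome_mb=EXOME_MB):
    """Tumor mutation burden in mutations per megabase."""
    return missense_count / exome_mb


def r_squared(xs, ys):
    """Coefficient of determination for a least-squares line y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    # For a single predictor with intercept, R^2 equals the squared correlation.
    return (sxy * sxy) / (sxx * syy)


# Hypothetical per-patient counts, not the values in Table 1.
missense = [60, 138, 250, 400]
neoantigens = [95, 202, 410, 700]
fit_quality = r_squared(missense, neoantigens)
```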

C>A and C>T Substitutions Are Major Mutation Types in Gastric Cancer. We observed an average of 232 nonsynonymous mutations per patient in GC. In comparison with a study from Bi et al., we found that nonsynonymous mutation counts in GC were significantly higher than in meningioma, thyroid cancer, pituitary adenoma, craniopharyngioma, breast cancer, and glioblastoma [21]. Our result was also in accordance with the mutational signatures generated by Alexandrov et al. [22,23].
We analyzed the nucleotide substitution types of the 7,432 missense mutations and found that 60.47% of the substitutions were transversions and 39.53% were transitions. The individual substitution types are shown at the bottom of Figure 2. On average, C>A accounted for 32.18% of substitutions, C>T for 27.24%, T>G for 12.51%, T>C for 12.29%, C>G for 9.89%, and T>A for 5.89%. C>A and C>T were therefore the major substitution types among missense somatic mutations.
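The six-class substitution spectrum described above can be sketched as follows. This is an illustrative reconstruction under the usual convention of reporting each change on the pyrimidine strand. It is not the code used in the study, and the function names and example variants are hypothetical.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
TRANSITIONS = {"C>T", "T>C"}  # the other four classes are transversions


def substitution_class(ref, alt):
    """Collapse a single-base change onto the pyrimidine (C or T) reference."""
    if ref == alt or ref not in COMPLEMENT or alt not in COMPLEMENT:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    if ref in "AG":  # purine reference: report the complementary-strand change
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def spectrum(variants):
    """Return per-class fractions and the overall transition fraction."""
    counts = Counter(substitution_class(r, a) for r, a in variants)
    total = sum(counts.values())
    fractions = {k: v / total for k, v in counts.items()}
    ti = sum(v for k, v in fractions.items() if k in TRANSITIONS)
    return fractions, ti


# Hypothetical (ref, alt) calls.
fractions, transition_fraction = spectrum(
    [("G", "A"), ("C", "A"), ("A", "G"), ("G", "T")]
)
```

Under this classification the reported percentages are internally consistent: the four transversion classes (32.18 + 12.51 + 9.89 + 5.89) sum to 60.47%, and the two transition classes (27.24 + 12.29) sum to 39.53%.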
COSMIC provides a set of 30 mutational signatures based on a large-scale analysis across nearly 40 human cancer types. Eleven of the 30 mutational signatures have been reported in gastric cancer. The predominance of C>A and C>T suggests hyperactive deamination and a transcribed-strand bias arising from transcription-coupled nucleotide excision repair in gastric cancer.
Twelve Recurrently Mutated Genes in Gastric Cancer. The 7,432 missense mutations were distributed across 4,451 genes. We first retained the genes that were mutated in at least three patients, which reduced the number of genes to 232 (see Additional file 1: Table S1). The upper part of Figure 2 presents a heat map of the 232 mutant genes across the 32 patients. The most recurrently mutated gene was MUC4, with an occurrence of 93.75% in this study and more than 8 nonsynonymous mutations per patient. Mucin 4 is an integral membrane glycoprotein. As a major constituent of mucus, Mucin 4 plays important roles in the protection of epithelial cells in the colon, cervix, and trachea. Silencing or reduced expression of MUC4 is associated with proliferation of a pancreatic carcinoma cell line [24] and with poor prognosis in renal cell carcinoma and breast carcinogenesis [25,26]. However, MUC4 has not yet been considered a cancer-related gene in COSMIC. We then matched these 232 recurrent genes to the Cancer Gene Census; only 12 genes were filtered as essential tumor- activity, PREX2 interacts with PIK3CA signaling pathway. In the four gastric cancer subtypes classified by TCGA, PIK3CA mutations occur at a frequency between 3% and 42% [12]. TP53, NOTCH , FAT4, CDH1, and BRCA2 are tumor suppressor genes (TSG). TP53 somatic mutations were observed in 71% of the chromosomal instability (CIN) subtype, and CDH1 mutations were enriched in 37% of the genomically stable (GS) subtype [12].
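The two filtering steps described above, recurrence in at least three patients followed by matching against a cancer gene list, can be sketched as follows. This is a minimal illustration with hypothetical patient identifiers and a toy gene list standing in for the Cancer Gene Census. It is not the study's actual code.

```python
from collections import defaultdict


def recurrent_genes(mutations, min_patients=3):
    """Genes mutated in at least `min_patients` distinct patients.

    mutations: iterable of (patient_id, gene) pairs, one per missense call.
    """
    patients_by_gene = defaultdict(set)
    for patient, gene in mutations:
        patients_by_gene[gene].add(patient)  # count each patient once per gene
    return {g for g, p in patients_by_gene.items() if len(p) >= min_patients}


def census_hits(genes, census):
    """Recurrent genes that also appear in a cancer gene list."""
    return sorted(set(genes) & set(census))


# Hypothetical calls: MUC4 is recurrent but absent from the toy census.
calls = [
    ("P1", "MUC4"), ("P1", "MUC4"), ("P2", "MUC4"), ("P3", "MUC4"),
    ("P1", "PIK3CA"), ("P2", "PIK3CA"), ("P4", "PIK3CA"),
    ("P1", "GNAQ"), ("P2", "GNAQ"),
]
toy_census = {"PIK3CA", "GNAQ", "TP53"}
hits = census_hits(recurrent_genes(calls), toy_census)
```

Counting distinct patients rather than mutation calls keeps a gene with many mutations in a single patient from appearing recurrent.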

Neoantigen Profiling of Gastric Cancer Patients Revealed Significant Differences between Stages. We examined how the number of neoantigens was associated with the patients' clinical characteristics (see Figure 3). Our study enrolled 11 female and 21 male patients; the median number of neoantigens for male patients was higher than that for female patients. The patients below 45 years of age (n=4) had fewer predicted neoantigens than those older than 45 years (n=28). However, the difference was not statistically significant. GC patients are usually diagnosed at more advanced stages of cancer progression. Of the 32 patients studied, 23 were diagnosed at T4 stage. Only 2 patients were diagnosed at T1 stage and 7 patients at T2 stage. Unexpectedly, the number of neoantigens was significantly higher in T1a than in T2 (p=0.02) and T4b (p=0.03). The lower number of neoantigens at later stages might be due to the enrichment of dominant malignant subclones as the tumor progressed. We are continuing to enroll early-stage patients to test whether this pattern remains statistically significant.

Six Recurrently Mutated Genes Encoding Neoantigens Predicted to Be Potential Targets against GC. Based on the HLA genotype (see Additional file 2: Table S2) [27].
From the 12,929 predicted neoantigens, we focused on the recurrently mutated genes encoding neoantigens that were present in at least 3 patients. This filter retained 54 genes, and a heat map was made to visualize their distribution (see Figure 4). We then matched the 54 genes to the Cancer Gene Census. PIK3CA, FAT4, BRCA2, GNAQ, LRP1B, and PREX2 were identified as genes highly associated with cancer development.
The detailed mutations, HLA alleles, and sequences of the 139 neoantigens derived from these six mutated genes are listed in Table S3 (see Additional file 3). The missense mutations found in PIK3CA, FAT4, BRCA2, LRP1B, and PREX2 were not recurrent in our study, and the sequences of the predicted neoantigens were unique. The amino acid change T96S in GNAQ was the only identical mutation found in three patients, and the mutant peptides were predicted to bind strongly to the HLA-A*02:01, HLA-A*03:01, and HLA-A*11:01 alleles (affinity IC50 < 100 nM). Unfortunately, the predicted neoantigen sequences for T96S in GNAQ were not identical across patients because of their different individual HLA-A genotypes.
Because of the small size of our cohort, we used the mutation frequencies published by the ICGC Project as a reference. We found that mutations S37L (1/12,198) and N289H (3/12,198) in BRCA2, Q453L (5/12,198) and A807V (6/12,198) in FAT4, T96S (10/12,198) in GNAQ, and H1047Y (8/12,198) and V344M (5/12,198) in PIK3CA have already been reported in the ICGC dataset (see Table 2), although at a very low frequency. PIK3CA and its signaling pathway have been widely studied in many cancers. H1047 is a hotspot mutation site in PIK3CA. The alteration from histidine to arginine recurs in 281 of 12,198 donors in the ICGC database. Our patient S0616102601 had a mutation at the same position but with an alteration from histidine to tyrosine. Only 1 donor from the ICGC gastric cancer dataset exhibited an identical mutation. In addition, we found two germline mutations in BRCA2 (A2466V and N372H). In ClinVar, A2466V and N372H are considered benign mutations in familial breast cancer. N372H is also annotated as a variant of unknown significance in Online Mendelian Inheritance in Man (OMIM).
The development of effective therapies against cancer has been a longstanding goal for many decades. Improvements in anticancer therapies rely on a deeper understanding of the oncology and molecular basis of cancer. Recent positive developments in immunotherapy targeting neoantigens show promise as an effective method to treat cancer.
Neoantigen-associated immunotherapy could be a feasible strategy for treating malignancies with somatic mutations in driver genes encoding intracellular proteins such as KRAS. For the past 30 years, KRAS has been considered "undruggable" because it lacks a binding pocket. The well-known amino acid alteration of Gly12 (G12V, G12C, or G12D) in the KRAS protein is involved in 60-70% of pancreatic cancers and 20-30% of colorectal cancers [28]. Work from Rosenberg and Tran reported an impressive response in G12D-positive metastatic colorectal cancer patients treated with autologous T-cell therapy using the HLA-C*08:02 allele [29]. Of the 7 patients treated, 6 patients were reported to respond positively and are currently in remission. Mutations in KRAS are also an important cause of resistance to EGFR-targeted tyrosine kinase inhibitor (TKI) drugs. Moreover, the mutant peptides derived from V600E in BRAF [30,31] or T790M in EGFR [32] were reported to bind HLA-A*02:01 and to be presented as neoantigens. Therefore, neoantigen-associated immunotherapy could be employed to treat cancers exhibiting common drug resistance mutations.
In the top 20 recurrently mutated genes of three ICGC projects (GACA-CN, GACA-JP, and STAD-US), seven genes were shared between Chinese and Japanese gastric cancer [33]. Chen found that NRG was mutated in 10% of 78 GC patients, and 8% of patients with mutations in BRCA2 were associated with longer survival [34]. Aberrant methylation of LRP1B [35] and mutations in LDL receptor-related protein 1B cause SMAD -induced GC growth [36]. PREX2 is involved in the PIK3CA-PTEN-AKT signaling pathway. The frequency of GNAQ mutations was higher in intestinal-type gastric cancer [37]. Suppression of the NOTCH signaling pathway could induce GC progression, drug resistance, and metastasis [38,39]. The function of PCM and USP in gastric cancer remains to be explored. Recurrent oncogenic mutations such as S37L and N289H in BRCA2, Q453L and A807V in FAT4, T96S in GNAQ, and H1047Y and V344M in PIK3CA were predicted to bind HLA-A*02:01, A*03:01, A*11:01, B*15:01, B*15:02, B*58:01, B*40:01, B*39:01, and C*03:02. Unlike in African or Caucasian populations, the A*02:01, A*11:01, B*58:01, B*40:01, B*15:01, and C*03:02 HLA alleles are each present in more than 5% of the Han Chinese population. The likelihood of discovering a common neoantigen target, defined by an identical HLA allele and oncogenic mutation, may therefore be higher in Chinese patients than in African or Caucasian populations.
While it is unknown why some individuals fail to respond favourably to T-cell-based neoantigen immunotherapy, targeting recurrent mutations of driver genes presented by common HLA alleles still represents a promising avenue to treat eligible patients.
In this study, we analyzed the clinical features of Chinese GC patients and related them to the somatic mutations and neoantigens present. We chose some recurrent neoantigens and their associated oncogenes shared by several patients for the continued development of T-cell-based immunotherapy, such as vaccines. Because of the limited sample size of our study, we included the mutation frequencies from the ICGC Project in our analysis to determine recurrent oncogenic mutations. Further studies should be conducted to confirm these results.

Conclusions
In conclusion, twelve recurrently mutated driver genes were identified in our study, contributing to the understanding of the mechanism of GC development. To identify "druggable" targets, neoantigen profiling was performed with TSNAD, highlighting several recurrent oncogenic driver mutations. Mutant peptides encoded by seven recurrent oncogenic mutations were predicted to bind high-frequency HLA alleles as tumor-specific neoantigens. These neoantigens are currently undergoing further experimental validation as potential targets for autologous T-cell immunotherapy to treat GC patients.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Ethical Approval
Ethical approval was obtained from Research Ethics Committee of the First Affiliated Hospital, Zhejiang University School of Medicine, China. All experimental procedures were performed in accordance with the relevant guidelines (Approved Guidelines of the Clinical and Laboratory Standards Institute MM01-A3, MM13-A, and MM20-A).

Consent
Human subjects were enrolled in this study after informed consent forms were signed. Written consent for the collection and use of tissues for research purposes was obtained.