EGFR Polymorphism and Survival of NSCLC Patients Treated with TKIs: A Systematic Review and Meta-Analysis

Tyrosine kinase inhibitor- (TKI-) based therapy revolutionized the overall survival and the quality of life of non-small-cell lung cancer (NSCLC) patients who carry epidermal growth factor receptor (EGFR) mutations. However, EGFR is a highly polymorphic and mutation-prone gene, with over 1200 single nucleotide polymorphisms (SNPs). Since the role of EGFR polymorphism in the treatment outcome is still a matter of debate, this research analyzed the available literature data according to the PRISMA guidelines for meta-analyses. The search covered PubMed, Scopus, ISI Web of Science, and 14 genome-wide association study (GWAS) electronic databases, in order to provide a quantitative assessment of the association between ten investigated EGFR SNPs and the survival of NSCLC patients. The pooled hazard ratios (HRs) and their 95% confidence intervals (CIs) for overall survival (OS) and progression-free survival (PFS) were calculated for different EGFR polymorphisms using a random- or fixed-effect model, chosen according to the calculated heterogeneity between the studies. The longest and the shortest median OS were reported for the homozygous wild genotype and for variant allele carriers of rs712829 (-216G>T), respectively. Quantitative synthesis in our study shows that out of ten investigated EGFR SNPs (rs11543848, rs11568315, rs11977388, rs2075102, rs2227983, rs2293347, rs4947492, rs712829, rs712830, and rs7809028), only four, namely, rs712829 (-216G>T), rs11568315 (CA repeat), rs2293347 (D994D), and rs4947492, have been reported to affect the outcome of TKI-based NSCLC treatment. Of these, only -216G>T and variable CA repeat polymorphisms have been confirmed by meta-analysis of available data to significantly affect OS and PFS in gefitinib- or erlotinib-treated NSCLC patients.


Introduction
For the past several decades, lung cancer has remained one of the major causes of mortality worldwide [1][2][3]. According to the World Health Organization, it is the most commonly diagnosed cancer and the leading cause of cancer death, with over 2 million new cases and more than 1.7 million deaths in 2018 [4,5]. Of those, over 85% are due to non-small-cell lung cancer (NSCLC), which exhibits better prognosis than its complement, i.e., small cell lung cancer [1], yet displays low long-term survival and reduced quality of life [6,7]. Although cigarette smoking represents the primary risk factor for NSCLC development [8], numerous investigations confirmed that genetics plays one of the leading roles in the process [9][10][11]. Gene variations that have been identified as conferring higher risk of NSCLC can be either germline or somatic, with some of the most common lung cancer-related driver mutations linked to the epidermal growth factor receptor gene (EGFR) [12].
EGFR is a transmembrane tyrosine kinase receptor that, upon activation, becomes a transducer of signals for cell proliferation [13]. EGFR overexpression, often due to genetic alterations, has been firmly and consistently associated with carcinogenesis [13][14][15], and EGFR itself has been recognized as a potential target of an important therapeutic approach to cancer. Namely, it has been observed that drugs that inhibit tyrosine kinases, enzymes important for tumor cell proliferation, growth, and metastasis, display target-specific antitumor activity against different types of malignancies, including lung, breast, colorectal, and prostate cancer [16]. Since the discovery of gefitinib, the first tyrosine kinase inhibitor (TKI) aimed at EGFR [17], several similar drugs have been approved for the treatment of NSCLC, including erlotinib [18,19]. Compared with chemotherapy as the former treatment of choice, TKI-based therapy revolutionized the overall survival and the quality of life of NSCLC patients, especially if they are carriers of the EGFR driver mutations [20][21][22][23]. Still, for the majority of patients, the prognosis of NSCLC remains unfavorable, mainly as a consequence of either intrinsic or acquired resistance to TKIs. While acquired resistance develops during the treatment, mostly due to the occurrence of secondary EGFR mutations, intrinsic resistance usually implies the presence of inherited variations, including EGFR single-nucleotide polymorphisms (SNPs) [24][25][26][27].
EGFR is a highly polymorphic and mutation-prone gene, with over 1200 SNPs [28] and over 2700 mutations [29] described so far. EGFR mutations have been extensively studied in relation to NSCLC, and some of them, including alterations in the tyrosine kinase domain, were clearly associated with better response to TKI-based therapy [30]. Yet, the role of EGFR polymorphism in the treatment outcome is still a matter of debate, as published research studies offer inconsistent results [31,32], and available meta-analyses lack comprehensiveness in terms of included SNPs [25,33]. Therefore, the aim of our study was to review and analyze the available literature on TKI-based therapy, in order to provide a quantitative assessment of the association between EGFR polymorphism and the survival of NSCLC patients.

Literature Search and Study Selection.
To identify the studies on the association between EGFR polymorphisms and the survival in NSCLC patients treated with TKI therapy, a systematic search of the available literature was performed according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines for meta-analyses and systematic reviews [34]. Two other databases, i.e., Scopus and ISI Web of Science, were searched using the appropriately modified initial PubMed search query (details are available upon request). In addition, a detailed search of several publicly available databases of genome-wide association studies (GWAS) was carried out, including the GWAS Central [38], the Genetic Associations and Mechanisms in Oncology (GAME-ON) [39], the Human Genome Epidemiology (HuGE) Navigator [40], the National Human Genome Research Institute (NHGRI GWAS Catalog) [41], the database of Genotypes and Phenotypes (dbGaP) [42], the GWASdb [43], the Italian Genome-Wide Database (IGDB) [44], and the GRASP: Genome-Wide Repository of Associations between SNPs and Phenotypes [45]. Finally, we separately searched the bibliographies of eligible studies to look for additional studies. We considered studies published until February 09, 2018 and written in English, Italian, or Russian.
Studies were considered eligible if they assessed the association between EGFR polymorphism and survival in NSCLC patients treated with EGFR-TKI, with survival expressed as the progression-free survival (PFS), time to progression (TTP), or the overall survival (OS). PFS was defined as the time from the first day of EGFR-TKI treatment until tumor progression or death from any cause, while censoring the patients that were lost to follow-up [46]. TTP was defined as the time from initiating the therapy until disease progression as the event of interest [47]. Finally, OS was defined as the period from the first day of EGFR-TKI therapy to the date of death or final follow-up, whichever came first [47]. Hazard ratios (HRs) with corresponding 95% CIs were used to evaluate the quantitative aggregation of the survival for different genotypes of EGFR.
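The three endpoint definitions above differ only in which event stops the clock. A minimal Python sketch of that logic follows; the function names, the record layout, and the convention that `last_follow_up` is the last date on which the patient's status was known are illustrative assumptions, not part of the included studies' methods.

```python
from datetime import date

DAYS_PER_MONTH = 30.4375  # average month length, assumed for unit conversion


def months_between(start, end):
    """Elapsed time between two dates, expressed in months."""
    return (end - start).days / DAYS_PER_MONTH


def survival_endpoints(start, last_follow_up, progression=None, death=None):
    """Return {endpoint: (months, event_observed)} for one hypothetical patient.

    PFS: event is progression or death from any cause, whichever is earlier.
    TTP: event is progression only.
    OS:  event is death.
    Patients without the relevant event are censored at last_follow_up.
    """
    observed = [d for d in (progression, death) if d is not None]
    event_dates = {
        "PFS": min(observed) if observed else None,
        "TTP": progression,
        "OS": death,
    }
    result = {}
    for name, event in event_dates.items():
        end = event if event is not None else last_follow_up
        result[name] = (months_between(start, end), event is not None)
    return result
```

For a patient who progresses and later dies, PFS and TTP coincide and are shorter than OS; for a patient with no recorded event, all three are censored at the last follow-up.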
After all potentially eligible studies were collected, crosslinking of the studies from different electronic databases was performed in order to remove duplicates. Two reviewers (J.O. and V.V.) independently screened the titles and abstracts of the relevant articles, and any disagreement was resolved through discussion. Full texts of the potentially eligible studies were subsequently retrieved and assessed for final inclusion, according to the reported criteria. Namely, we only included studies conducted on patients with histopathologically confirmed NSCLC who received EGFR-TKI-based therapy and where the measures of outcome were reported according to the EGFR genotype. On the other hand, reviews, meta-analyses, editorials, case reports, and studies conducted only on cell lines were excluded. When there were multiple publications on the same or overlapping study population, we only included the most comprehensive one. The same investigators evaluated the methodological quality of included studies using the widely accepted Newcastle-Ottawa Quality Assessment Scale (NOS) for cohort studies [48] and the Jadad Scale for the randomized controlled trials (RCTs) [49].

Data Extraction and Quality
The NOS for cohort studies evaluates three perspectives of the methodological quality: the selection of the study groups (four points); the comparability of the groups (two points); and the ascertainment of exposure or outcome of interest for cohort studies (three points), and assigns a maximum total of 9 points. The Jadad scale for reporting randomized controlled trials evaluates the risk of bias in three domains: randomization, double blinding, and description of withdrawals and dropouts, with a final score from 0 to 5. Any disagreements between the reviewers were resolved through discussion or in consultation with other authors.

Statistical Analysis.
Meta-analysis was conducted when at least two studies on the same genetic variant were available. We calculated the pooled HR and their 95% CI for OS and PFS for different EGFR polymorphisms using a random or fixed effect model based on the calculated heterogeneity between the studies [50]. The χ²-based Q statistics and the I² statistics [51] were used to evaluate the between-study heterogeneity, with I² = 0% indicating no observed heterogeneity, 25% regarded as low, 50% as moderate, and 75% as high [52]. When the Q test or the I² test indicated significant heterogeneity between the studies (p < 0.10, I² > 50%), the random-effect model was used; otherwise, the fixed-effect model was applied. Additionally, Galbraith's plot was constructed to explore the weight each study had on the overall estimate and the contribution to the Q statistics for heterogeneity [53].
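The model-selection rule described above can be illustrated with a short Python sketch. It works on the log-HR scale, recovers each standard error from the reported 95% CI, and pools by inverse-variance weighting. Several points are assumptions rather than the authors' documented procedure: the between-study variance uses the DerSimonian-Laird estimator, only the I² > 50% criterion is applied (the p-value of the Q test is omitted to keep the sketch within the standard library), and the function names are hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% CI


def log_hr_and_se(hr, ci_low, ci_high):
    """Recover the log hazard ratio and its standard error from a 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return math.log(hr), se


def pool_hazard_ratios(studies, i2_threshold=50.0):
    """Pool a list of (HR, CI low, CI high) tuples by inverse-variance weights."""
    effects, ses = zip(*(log_hr_and_se(*s) for s in studies))
    weights = [1 / se ** 2 for se in ses]

    # Fixed-effect estimate, Cochran's Q, and I^2 (as a percentage).
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(studies) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    model = "fixed"
    if i2 > i2_threshold:
        # Assumed DerSimonian-Laird estimate of the between-study variance.
        c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
        tau2 = max(0.0, (q - df) / c)
        weights = [1 / (se ** 2 + tau2) for se in ses]
        model = "random"

    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    se_pooled = math.sqrt(1 / sum(weights))
    ci = (math.exp(pooled - Z_95 * se_pooled), math.exp(pooled + Z_95 * se_pooled))
    return {"model": model, "hr": math.exp(pooled), "ci": ci, "q": q, "i2": i2}
```

Two studies with identical estimates give Q = 0 and I² = 0%, so the fixed-effect branch is taken; two studies pointing in opposite directions with narrow CIs push I² well above 50% and trigger the random-effect branch.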
We also performed a one-way sensitivity analysis to check stability of the results. To assess the publication bias (where appropriate), we conducted Egger's asymmetry test (level of significance p < 0.05) [54]. All statistical analyses were performed using the STATA software package v.15 (STATA Corporation, College Station, TX, USA), and statistical significance was set at p < 0.05.
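Egger's asymmetry test regresses each study's standardized effect (log HR divided by its standard error) on its precision (the reciprocal of the standard error); an intercept far from zero suggests funnel-plot asymmetry. The sketch below computes the intercept and its t statistic with ordinary least squares in plain Python. It is an illustration only: the analyses in this study were run in STATA, the function name is hypothetical, and the p-value (from a t distribution with k - 2 degrees of freedom) is left to a statistics library.

```python
import math


def egger_intercept(log_effects, standard_errors):
    """Return (intercept, slope, t statistic of the intercept) for Egger's regression."""
    k = len(log_effects)
    if k < 3:
        raise ValueError("Egger's regression needs at least three studies")

    x = [1 / se for se in standard_errors]                     # precision
    y = [e / se for e, se in zip(log_effects, standard_errors)]  # standardized effect

    mean_x, mean_y = sum(x) / k, sum(y) / k
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all studies have the same precision")

    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance with k - 2 degrees of freedom, then SE of the intercept.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(r ** 2 for r in residuals) / (k - 2)
    se_intercept = math.sqrt(s2 * (1 / k + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else math.nan
    return intercept, slope, t_stat
```

When every study reports the same underlying effect, the standardized effects lie on a line through the origin, so the intercept is zero and the slope equals the common log HR. With only a handful of studies, as here, the test has little power.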

Search Results and Study Characteristics.
Of 5467 records obtained through the screening of PubMed, ISI WOS, Scopus, and 14 GWAS databases, 3699 remained after removing the duplicates. After reading the titles and abstracts, 42 full-text articles were assessed for inclusion. We further excluded 33 papers for not fulfilling the inclusion criteria, leaving 9 studies as eligible. After inspecting the references of the included studies, we identified two additional studies, bringing the total to 11 studies included in the review. Ultimately, 5 studies were incorporated in the quantitative synthesis for OS, and 4 were considered for PFS (Figure 1).
Our search results consisted of 10 cohort studies [25,26,31,32,[55][56][57][58][59][60] and 1 randomized controlled trial (RCT) [61], conducted in high-income Western countries and in Asia. Overall quality of the included studies was good, with two studies [56,60] scoring the maximum number of points on the NOS. The highest scores were obtained for most of the evaluated domains, while the domain of follow-up adequacy had the lowest score (Supplementary Table 1). Sample size varied from 62 to 760 patients, and median age from 55.2 to 67.0 years. The majority of patients were in clinical stages III and IV. Reported median follow-up time ranged from 11.4 months up to 62.7 months, and the TKIs used in the studies were gefitinib and erlotinib. A detailed description of the included studies is presented in Table 1.
All included studies investigated the OS and reported their findings for 10 different EGFR SNPs, namely, rs11543848, rs11568315, rs11977388, rs2075102, rs2227983, rs2293347, rs4947492, rs712829, rs712830, and rs7809028. The longest and the shortest median OS were reported for the homozygous wild genotype and for variant allele carriers of rs712829 (-216G>T), respectively [25]. Increased HR was observed in patients lacking the wild-type (CA)16 allele (rs11568315) as compared with carriers of at least one (CA)16 allele [58]. On the other hand, lower HR (0.29; 95% CI: 0.10-0.83), indicating better prognosis, was reported in homozygous carriers of the less common g.106268G allele (rs4947492) [31], in carriers of at least one variant -216T (rs712829) allele (HR = 0.67; 95% CI: 0.48-0.94) [61], as well as in carriers of a lower number of CA repeats (HR = 0.43; 95% CI: 0.23-0.78) [59], as compared with their corresponding genotypes. Interestingly, carriers of one or both variant alleles, as compared with the homozygous wild genotype for 181946C>T (rs2293347), displayed increased HR according to one [31] and decreased HR according to the other [32] investigated study. The details of OS, HR, and RR for each investigated SNP across the included studies are presented in Table 2.
Of all included studies, six [25,26,32,56,59,61] reported PFS in relation to five different SNPs, with only one of them reporting the median PFS time [25]. On the other hand, TTP was reported in only two studies [55,57] and was stratified according to four different SNPs (Table 3). Based on the PFS reports, better prognosis was associated with rs11568315, rs2293347, and rs712829 polymorphisms, i.e., with the presence of a lower number of CA repeats [55] and of variant 181946T [32] and -216T [56] alleles. In addition, a lower number of CA repeats (rs11568315) was also associated with better TTP [55].
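The selection flow reported above can be checked with a few lines of arithmetic. The Python sketch below uses only the counts stated in the text; the variable names are illustrative, and the derived quantities (duplicates removed, records excluded at title and abstract screening) are implied by the reported numbers rather than stated explicitly.

```python
# Counts reported in the text.
records_identified = 5467
records_after_deduplication = 3699
full_texts_assessed = 42
full_texts_excluded = 33
studies_from_reference_lists = 2

# Quantities implied by those counts.
duplicates_removed = records_identified - records_after_deduplication
excluded_at_screening = records_after_deduplication - full_texts_assessed
eligible_from_search = full_texts_assessed - full_texts_excluded
studies_in_review = eligible_from_search + studies_from_reference_lists

print(f"duplicates removed:            {duplicates_removed}")
print(f"excluded on title/abstract:    {excluded_at_screening}")
print(f"eligible from database search: {eligible_from_search}")
print(f"studies included in review:    {studies_in_review}")
```

The totals reproduce the 9 eligible studies from the database search and the 11 studies finally included in the review.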

Quantitative Synthesis.
Five studies [31,32,56,59,61] reported enough information about OS to be included in our meta-analysis, and the forest plot with pooled HR and their 95% CI of OS available for four SNPs, namely, rs11568315, rs712830, rs712829, and rs712830, is presented in Figure 2. Due to significant heterogeneity between the studies, the random effect model was applied. Egger test and Begg's correlation method demonstrated no evidence of publication bias. Our analysis revealed rs712829 (HR = 0.80, 95% CI: 0.67-0.96, p = 0.01; heterogeneity I² = 0%, p = 0.37) and rs11568315 (HR = 0.56, 95% CI: 0.32-0.99, p = 0.046; heterogeneity I² = 51.9%, p = 0.15) polymorphisms, more precisely the presence of at least one -216T variant allele and the presence of ≤16 CA repeats, respectively, as the only positive prognostic factors for the OS, with no observed heterogeneity (Figure 2). The Egger test demonstrated no statistical evidence of publication bias for rs712829 and rs712830 (p = 0.48 and p = 0.6, respectively). As Galbraith's plot, performed to explore the potential sources of heterogeneity, identified the study of Winther-Larsen et al. [32], this study was omitted in the one-way sensitivity analysis of rs712829. Yet, the overall HR remained significant and was only slightly changed to 0.69 (95% CI: 0.52-0.91, p < 0.008; heterogeneity I² = 0%, p = 0.78). Other investigated SNPs demonstrated increased pooled HR but without statistical significance.
Four studies [32,56,59,61] were included in the pooled analysis of PFS in patients stratified according to genotyping data available for three EGFR SNPs, i.e., rs11568315, rs712829, and rs712830. As there was no significant heterogeneity between the studies, the fixed effect model was applied. No significant publication bias was demonstrated by Egger tests (rs712829 and rs712830, p = 0.19 and p = 0.08, respectively), even though these tests for exploring the publication bias are underpowered when only a few studies are included. Again, the only significant factors, indicating better prognosis in NSCLC treated with TKIs, were the presence of at least one -216T variant allele (HR = 0.81, 95% CI: 0.68-0.96, p = 0.02; heterogeneity I² = 0%, p = 0.37) and the presence of ≤16 CA repeats (Figure 3). The Galbraith plot identified the study of Winther-Larsen et al. [32]; thus, we omitted this study in the one-way sensitivity analysis for rs712829. The results confirmed statistical significance of rs712829-related HR, which was only slightly
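A reported pooled HR, its 95% CI, and its p-value are linked through the normal approximation on the log scale, which gives a quick consistency check on the OS results above. The Python sketch below recovers the standard error from the CI limits and recomputes a two-sided p-value; the function name is hypothetical, and the Wald (normal) approximation is an assumption about how the reported p-values were obtained.

```python
import math


def p_value_from_hr_ci(hr, ci_low, ci_high, z_crit=1.96):
    """Two-sided Wald p-value implied by a hazard ratio and its 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(hr) / se
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Pooled OS estimates as reported in the text.
reported = {
    "rs712829": (0.80, 0.67, 0.96),
    "rs11568315": (0.56, 0.32, 0.99),
}

for snp, (hr, low, high) in reported.items():
    print(f"{snp}: implied two-sided p = {p_value_from_hr_ci(hr, low, high):.3f}")
```

Under these assumptions both implied p-values fall below 0.05 and lie close to the reported p = 0.01 and p = 0.046; small differences are expected from rounding of the published HR and CI limits.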

Discussion
The discovery of activating mutations in the EGFR gene fifteen years ago represented a major breakthrough in the treatment of NSCLC [22,23], as clinical responsiveness to TKIs, promising new treatment alternatives [17], turned out to be highly dependent on the presence of so-called "sensitizing" EGFR mutations [62]. Consequently, both of the first-generation TKIs, gefitinib and erlotinib, have been approved for the treatment of patients with metastatic NSCLC, but only if their tumors harbor EGFR exon 19 deletions or exon 21 (L858R) substitution mutations [63]. Nevertheless, neither EGFR mutation testing nor full TKI response is easy to achieve, as the former requires availability of samples of biopsied/resected tumor tissue or pleural effusion, along with appropriate methodology, expertise, and equipment [64], while the latter is undermined by intrinsic or acquired resistance to TKIs that exists or develops in the majority of patients [24][25][26].
To overcome these issues, numerous studies have been performed to disclose other important factors involved in the response to TKIs, aiming for those which could be more easily detected and which are already present at the beginning of the treatment, hence useful as potential prediction markers for TKI-based therapy outcome. A significant research effort has been focused on EGFR as the therapy target, revealing that certain germline variants of the EGFR gene could confer altered prognosis in their NSCLC-diagnosed carriers treated with TKIs [31,32,55,56,58,61]. However, the studies were either underpowered [25,55,56] or yielded conflicting results [26,32,58,61], leaving the possible role of EGFR SNPs in clinical responsiveness to TKIs insufficiently explored. To assess, consolidate, and integrate the available knowledge on this subject, we performed a systematic review and meta-analysis of published reports on the association between EGFR polymorphism and the survival of NSCLC patients. Of 10 EGFR SNPs evaluated in our study, four were reported to affect the response to TKIs, namely, rs712829 (-216G>T) [56,61], rs11568315 (CA repeat) [55,56,58,59], rs2293347 (D994D) [31,32], and rs4947492 [31]. However, pooled analysis of the available data revealed that only EGFR -216G>T and variable CA repeat polymorphisms significantly affect the prognosis of TKI-treated NSCLC patients, with longer OS and PFS associated with the presence of the variant -216T allele and ≤16 CA repeats. The 5′-flanking region of the EGFR gene acts as a promoter by binding the Sp1 transcription factor [65]. The EGFR -216G>T SNP is located in one of the Sp1 binding sites, thus affecting initiation of the EGFR transcription [66]. Namely, it has been discovered that the replacement of G by T at this position increases the promoter activity and gene expression by 30% and 40%, respectively [67].
Furthermore, this effect proved to be unrelated to the presence of other polymorphisms in this region, as well as to the cell type or EGFR expression level [67]. The observation that the response to TKIs only partly depends on the presence of EGFR activating mutations [68] opened the question of the yet-unexplained difference in therapy outcome, for which the -216G>T polymorphism seemed like a reasonable answer. Therefore, most of the studies investigating the association between EGFR polymorphism and NSCLC TKI-based treatment included -216G>T. Many of them revealed that it significantly improves treatment outcome [25,56,69] and increases the risk of treatment-related toxicity [56,57,70]. However, some failed to observe such associations [32,61,69,71]; thus, an overall conclusion regarding the importance of -216G>T has not been reached so far. In the present meta-analysis, whose advantage over previous publications lies in the higher validity and reliability of the results [72], EGFR -216G>T was significantly associated with both OS and PFS in TKI-treated NSCLC patients. Our results therefore suggest the possibility that this EGFR polymorphism can be used as an easy-to-obtain and ever-present additional predictive factor in these patients, which could simplify the decision-making process during prescribing and improve the outcome of the therapy. It should be noted, however, that all the reports included in our study were based on either gefitinib or erlotinib treatment; thus, our conclusions might not necessarily be relevant to the therapies based on newer TKIs, whose mechanism of action is slightly different [19,73]. The first intron of EGFR has an important regulatory function, which relies on the presence of an enhancer element that stimulates promoter activity [74]. EGFR SNP rs11568315 is located close to the enhancer in EGFR intron 1 and represents a variable simple sequence repeat (SSR) consisting of 14 up to 21 CA dinucleotides [75].
It has been observed that the transcription activity of EGFR declines with the increasing number of CA repeats, most probably due to alteration in DNA secondary structure, but also that this effect can be outweighed by other regulatory mechanisms [76]. To determine the possible role of this polymorphism in the response to TKIs, numerous studies investigated NSCLC, but also other types of cancer whose therapy is EGFR-targeted. Some of the published reports agree that the number of CA repeats affects the outcome of TKI-based therapy, with a lower number of CA repeats corresponding to higher response rate [55,60,77,78], longer time-to-progression [55,56,58,59,70], longer survival [58,59,77], and increased toxicity [79]. However, others did not detect any significant association [25,26,57,61], deeming this EGFR polymorphism to be clinically unimportant. Out of eight CA repeat-related studies involved in our systematic review, three reported an influence on survival, with carriers of 16 CA repeats (representing the shorter and the most frequent allele [80]) having longer OS [58,59], and carriers of alleles shorter than or equal to 16 CA having longer PFS [56,59], as compared with other NSCLC patients on gefitinib therapy. The present meta-analysis confirmed the observed effect of variable CA repeats on OS. The possible reasons for conflicting results in the literature might be the lack of consensus regarding cutoff values defining shorter versus longer CA repeats [26,81], the presence of linkage disequilibrium with other functional SNPs that remained undetected or unexplored [56,66,78], or interethnic differences in the allelic distribution [80]. Yet, our results indicate that the length of the CA repeat in EGFR intron 1 could be used as another predictive marker for the outcome of TKI-based therapy in NSCLC patients.
The last two EGFR SNPs reported to affect the outcome of TKI-based treatment of NSCLC, namely, rs2293347 (D994D) and rs4947492, are currently the least explored. Both are localized within regulatory regions, as the former resides in exon 25, i.e., within the C-terminal domain [82], and the latter in the first intron of EGFR [83]. So far, the role of rs2293347 in the treatment of NSCLC patients has been investigated in three different studies [31,32,78], and all of them reported a significant association between this polymorphism and the response to TKIs. Yet, while Ma et al. [78] and Zhang et al. [31] associated the presence of the variant allele with shorter OS, shorter PFS, and lower response rate, Winther-Larsen et al. [32] reported the opposite, with variant (albeit major) allele carriers on gefitinib therapy exhibiting higher disease control rate and longer OS and PFS. This EGFR SNP is synonymous; hence, it does not lead to a change of the amino acid sequence. Nevertheless, it has been confirmed that even synonymous variations can alter protein amount, structure, or function by affecting mRNA stability, translational kinetics, and splicing [84]. Bearing in mind the localization of rs2293347, i.e., its proximity to the TK domain [82], as well as the contradictory reports regarding its role in TKI efficacy and safety [31,32,78], this EGFR SNP could be considered a good candidate for future clinical trials. On the other hand, only one study of NSCLC treatment with TKIs evaluated the role of rs4947492, reporting a significant association of the variant allele with shorter OS [31]. This variation is believed to alter EGFR expression [31], yet linkage disequilibrium with other SNPs could also explain or affect its role in TKI-related treatment [83]. In any case, the observed effect needs further confirmation.
The present study has several limitations, including the lack or incompleteness of data regarding additional treatments used in the included studies, which could affect the overall outcome of the therapy. Also, linkage disequilibrium that has been described among different EGFR SNPs, or between SNPs and EGFR activating mutations, was not always taken into account. In addition, only publicly available reports were included in our study; thus, the possibility of a publication bias cannot be completely excluded. Other types of bias that might affect included studies, e.g., selection bias and information bias, might be present too. Also, it would have been valuable to stratify our findings according to sociodemographic characteristics and/or environmental effect modifiers, but this was not feasible since the original datasets were not available to us. Finally, the number of available studies for most of the investigated SNPs was insufficient for any sound conclusion to be drawn. Nevertheless, our study has several advantages. We used comprehensive and rigorous methodology to obtain all available eligible studies. Quality of the included studies was rather high, as confirmed by the appropriate quality measurement tools. Statistical power of our analyses was considerably increased with respect to any single study, because of the larger number of cases that were pooled for different SNPs.
In conclusion, our study shows that out of ten investigated EGFR SNPs (rs11543848, rs11568315, rs11977388, rs2075102, rs2227983, rs2293347, rs4947492, rs712829, rs712830, and rs7809028), only four, namely, rs712829 (-216G>T), rs11568315 (CA repeat), rs2293347 (D994D), and rs4947492, have been reported to affect the outcome of TKI-based NSCLC treatment. Of these, only -216G>T and variable CA repeat polymorphisms have been confirmed by meta-analysis of available data to significantly affect OS and PFS in gefitinib- or erlotinib-treated NSCLC patients. To ascertain whether these SNPs affect the response to other TKIs, as well as whether other EGFR SNPs have a role in NSCLC treatment, additional studies are warranted.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.

Authors' Contributions
Vladimir Jurisic and Vladimir Vukovic contributed equally to this work.

Supplementary Materials
Supplementary Table 1: methodological quality of the included studies: based on A, the Newcastle-Ottawa Quality Assessment Scale for cohort studies, and on B, Jadad scale for RCT. Supplementary