Cost-Effectiveness Analysis of Biomarker-Guided Treatment for Metastatic Gastric Cancer in the Second-Line Setting

Background. The 5-year survival rate of patients with metastatic gastric cancer (GC) is only 5%. However, trials have demonstrated promising antitumor activity for targeted therapies/immunotherapies among chemorefractory metastatic GC patients. Pembrolizumab has shown particular efficacy among patients with programmed death ligand-1 (PD-L1) expression and high microsatellite instability (MSI-H). The aim of this study was to assess the effectiveness and cost-effectiveness of biomarker-guided second-line GC treatment. Methods. We constructed a Markov decision-analytic model using clinical trial data. Our model compared pembrolizumab monotherapy and ramucirumab/paclitaxel combination therapy for all patients and pembrolizumab for patients based on MSI status or PD-L1 expression. Paclitaxel monotherapy and best supportive care for all patients were additional comparators. Costs of drugs, treatment administration, follow-up, and management of adverse events were estimated from a US payer perspective. The primary outcomes were quality-adjusted life years (QALYs) and incremental cost-effectiveness ratios (ICERs) with a willingness-to-pay threshold of $100,000/QALY over 60 months. Secondary outcomes were unadjusted life years (survival) and costs. Deterministic and probabilistic sensitivity analyses were performed to evaluate model uncertainty. Results. The most effective strategy was pembrolizumab for MSI-H patients and ramucirumab/paclitaxel for all other patients, adding 3.8 months or 2.0 quality-adjusted months compared to paclitaxel. However, this strategy resulted in a prohibitively high ICER of $1,074,620/QALY. The only cost-effective strategy was paclitaxel monotherapy for all patients, with an ICER of $53,705/QALY. Conclusion. Biomarker-based treatments with targeted therapies/immunotherapies for second-line metastatic GC patients substantially improve unadjusted and quality-adjusted survival but are not cost-effective at current drug prices.


Introduction
Gastric cancer (GC) is the fifth most common cancer worldwide and the third leading cause of cancer-related mortality, with a 30% five-year survival rate for all stages [1,2]. In addition, 25% of GC patients present with advanced disease, and another 25%-50% progress to metastatic disease [3]. The prognosis is especially poor for nonresponders to first-line (1L) chemotherapy, who face a challenging decision between supportive/palliative care and more aggressive intervention with taxanes, antiangiogenic therapy, or PD-1 blockade. Insurance claims data suggest that, as of 2012, a majority (54.5%) of GC patients who progress on 1L treatment go on to receive a second line of chemotherapy [4]. As novel therapies are incorporated into the second- and third-line (2L, 3L) treatment landscapes, further analysis is warranted to optimize quality-adjusted life years (QALYs) and determine the impact of personalized, biomarker-based treatment on the effectiveness and cost-effectiveness of treatment.
The 2018 National Comprehensive Cancer Network (NCCN) guidelines for treatment of chemorefractory, metastatic, or recurrent gastric cancer include the anti-PD-1 agent pembrolizumab (PEM), the VEGFR-2 antagonist ramucirumab (RAM), and a variety of taxanes, such as paclitaxel (PAC) monotherapy and its combination with RAM (RAM/PAC). Recommendations of RAM and PEM were primarily based on the outcomes of the RAINBOW and KEYNOTE-059 trials, respectively, and studies by Le et al. on deficient mismatch repair (dMMR) tumors (2015, 2017) [5][6][7][8][9]. The guidelines recommend PEM as 2L treatment for GC patients with high microsatellite instability (MSI-H) or deficient mismatch repair (dMMR) status and as 3L treatment for GC patients with positive PD-L1 expression using a 1% combined positive score (CPS) threshold [5].
The KEYNOTE-059 and KEYNOTE-061 trials showed promising results of durable response to PEM. Among PD-L1+ patients on PEM, median overall survival (OS) was 9.1 months and median duration of response was 15.7 months, with a range exceeding 23 months [10]. PAC monotherapy and RAM/PAC combination therapy showed similar OS rates, but much lower median durations of response (2.8 months and 4.4 months, respectively) [7]. This warrants further analysis to determine if durability of response makes PEM a cost-effective therapy.
Personalized or biomarker-based treatment may be crucial to the effectiveness and cost-effectiveness of the treatment landscape for advanced GC [11][12][13]. An estimated 14-24% of GC/GEJC patients express PD-L1 in tumor cells, and an estimated 35% in the tumor microenvironment [14]. The KEYNOTE-061 trial reported combined positive scores (CPS) for PD-L1 expression, which includes expression on both tumor and immune cells. Among patients treated with PEM, 66.2% (n = 196) of patients expressed PD-L1 at a 1% CPS threshold [10].
The Cancer Genome Atlas Research Network (TCGA) evaluated 295 primary gastric adenocarcinomas for molecular characteristics and found that 22% had high microsatellite instability (MSI-H) [15]. However, there is variability based on other published estimates. A recent review estimates that 10-39% of gastroesophageal cancers are MSI-H [14], but the proportion seems to be considerably smaller in stage IV tumors [6,10,16,17]. A recent prospective genomic profiling study of metastatic esophagogastric patients reported that only 3% of patients were MSI-H (n = 318) [16].
MSI-H GC patients show significantly higher PD-L1 expression compared to MSS GC patients (25.9% versus 8.4%, p = 0.003) [18,19], and accumulating data suggest improved survival and durable response to immunotherapy among patients with MSI-H status. The KEYNOTE-059 and KEYNOTE-061 trials report overall response rates (ORR) of 57.1% and 46.7%, respectively, although the numbers of MSI-H patients were small (n = 7 and n = 15, respectively) [6,10]. Despite the small sample sizes, these outcomes are consistent with a pooled ORR to PEM of 53.0% across different cancer sites for dMMR tumors (n = 86) [9].
Our analysis aimed to evaluate the impact of biomarker-based 2L GC treatment, compared to other commonly prescribed treatment regimens. In the biomarker-based strategies, patients with MSI-H/dMMR or PD-L1 positivity received PEM therapy, and the remaining patients received either PAC monotherapy or RAM/PAC combination therapy. We also evaluated best supportive care (BSC), PAC, PEM, and RAM/PAC for all GC patients on 2L treatment, regardless of biomarker status. The primary outcomes were QALYs as a measure of clinical effectiveness and incremental cost-effectiveness ratios (ICERs).

Methods
We utilized the CHEERS checklist for standardized methodology and reporting of cost-effectiveness analyses (Supplementary Table 2).

Patient Characteristics.
Our model simulated a hypothetical cohort of patients, which was constructed using pooled patient characteristics from the KEYNOTE-059, KEYNOTE-061, REGARD, and RAINBOW trials (Supplementary Table 1) [6,7,10,20]. The modeled cohort consisted of 60-year-old patients (70% male, 30% female) with unresectable recurrent/metastatic gastric adenocarcinomas undergoing second-line (2L) therapy. In accordance with the patient populations in the clinical trials, gastric adenocarcinomas include proximal, distal, and gastroesophageal junction (GEJ) cancers. The cohort entered the model following disease progression on a previous line of chemotherapy with a platinum and/or fluoropyrimidine, as well as trastuzumab in HER2+ patients. For the base case, we assumed that 40% of patients expressed PD-L1 at a CPS > 1% threshold and 10% of patients were MSI-H. These proportions were estimated based on proportions in the clinical trials and the literature [6,10,14-17,21].

Treatment Model.
A decision-analytic Markov model (Figure 1) was constructed in Python to analyze 2L treatments for GC patients. Patients could remain stable, experience disease progression, or die from cancer, severe adverse drug reactions, or age- and sex-specific all-cause mortality. Patients could present with treatment-related adverse events (trAEs) within the first 6 months of the model. After disease progression, patients were placed on best supportive care (BSC) as defined by the NCCN [5]. The model simulated a 60-month treatment and follow-up window, or 5-year time horizon, in monthly cycles. This time horizon was selected to capture the potential durable response to immunotherapy suggested in the literature [9,16].
We compared eight unique treatment strategies that are endorsed by current NCCN guidelines and were also recently studied in clinical trials [5]. In the first four strategies, all patients received the same treatment, regardless of biomarker status. These strategies were (1) BSC with no active treatment, (2) PEM monotherapy (200 mg flat dose or 2 mg/kg every three weeks), (3) PAC monotherapy (80 mg/m² three times per month), and (4) combination therapy with PAC (80 mg/m² three times per month) and RAM (8 mg/kg every two weeks). In the remaining four strategies, treatment was based on biomarker status: MSI-H/dMMR or PD-L1 expression at a 1% CPS threshold. Patients with a positive biomarker status received PEM, and the rest received either PAC monotherapy or RAM/PAC combination therapy.
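The state structure described above (stable disease, progressed disease, death, monthly cycles, 60-month horizon) can be illustrated with a minimal cohort trace. This is only a sketch under simplifying assumptions: the function name `run_markov_cohort` and the constant monthly probabilities are hypothetical and are not the inputs or the implementation used in the study, whose transition probabilities varied over time.

```python
def run_markov_cohort(p_progress, p_die_stable, p_die_progressed, n_cycles=60):
    """Trace a three-state cohort (stable, progressed, dead) in monthly cycles."""
    stable, progressed, dead = 1.0, 0.0, 0.0
    trace = [(stable, progressed, dead)]
    for _ in range(n_cycles):
        # Stable patients may die, progress, or remain stable in this cycle.
        die_from_stable = stable * p_die_stable
        newly_progressed = (stable - die_from_stable) * p_progress
        # Progressed patients (on best supportive care) may die or remain.
        die_from_progressed = progressed * p_die_progressed
        stable -= die_from_stable + newly_progressed
        progressed += newly_progressed - die_from_progressed
        dead += die_from_stable + die_from_progressed
        trace.append((stable, progressed, dead))
    return trace


# Hypothetical constant monthly probabilities (NOT the study's inputs).
trace = run_markov_cohort(p_progress=0.20, p_die_stable=0.02, p_die_progressed=0.15)

# Unadjusted life expectancy in months: sum of the living fraction per cycle.
life_months = sum(s + p for s, p, _ in trace[1:])
```

Each tuple in `trace` holds the cohort fractions in the three states, so the fractions in every cycle sum to one; costs and utilities would then be attached to these fractions cycle by cycle.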

2L Cancer Progression and Mortality Estimates.
Rates of disease progression and cancer mortality in each strategy were derived from the Kaplan-Meier (KM) estimates of overall survival (OS) and progression-free survival (PFS) of the respective clinical trials using piecewise linear fits to the curves (Table 1). Supplementary Figure 1 compares model outputs to clinical trial data. The PEM arms (including whole-population, MSI-H, and PD-L1+ subsets) were informed by data from the phase 3 open-label KEYNOTE-061 trial on pretreated gastric and GEJ adenocarcinomas [10]. PFS for the MSI-H subset was modeled using pooled data for all noncolorectal cancer sites [9]. Second-line treatment with PAC monotherapy and RAM/PAC was informed by data from the double-blind randomized phase 3 RAINBOW trial comparing the two treatments [7]. The BSC arm was informed by data from the placebo plus BSC group of the REGARD trial [20]. The same data were used to estimate the probability of cancer-specific mortality after disease progression and discontinuation of treatment in all strategies.
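One way to turn a piecewise linear fit of a survival curve into per-cycle probabilities is to take the conditional probability of an event in each month, 1 - S(t+1)/S(t). The sketch below illustrates this idea; the helper names and the knot values are hypothetical and were not digitized from any trial, and the study's exact fitting procedure may differ.

```python
def interp_survival(knots, t):
    """Piecewise-linear interpolation of a survival curve from (month, S) knots."""
    for (t0, s0), (t1, s1) in zip(knots, knots[1:]):
        if t0 <= t <= t1:
            return s0 + (s1 - s0) * (t - t0) / (t1 - t0)
    raise ValueError("t lies outside the fitted range")


def monthly_event_probs(knots, n_months):
    """Conditional probability of the event in each month: 1 - S(t+1)/S(t)."""
    probs = []
    for month in range(n_months):
        s_now = interp_survival(knots, month)
        s_next = interp_survival(knots, month + 1)
        probs.append(1.0 - s_next / s_now if s_now > 0 else 1.0)
    return probs


# Hypothetical (month, survival fraction) knots, for illustration only.
os_knots = [(0, 1.0), (6, 0.60), (12, 0.30), (24, 0.10)]
probs = monthly_event_probs(os_knots, 24)
```

Multiplying the monthly survival probabilities `1 - p` back together recovers the fitted curve at the end of the window, which is a quick consistency check of the kind a model-versus-trial comparison relies on.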
Age- and sex-specific all-cause mortality was estimated using 2014 US data from the Centers for Disease Control [40]. To be consistent with the patient population in the clinical trials, the probability of all-cause mortality was a weighted sum of male and female mortality data in a 70:30 ratio. When needed, the OS and PFS data were extrapolated beyond the time horizon of the clinical trials (∼24 months) by extending the KM curves and all-cause mortality rates.
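The 70:30 weighting of male and female mortality can be written out directly. In the sketch below, the annual death probabilities are placeholders rather than life-table values, and the conversion from an annual to a monthly probability assumes a constant hazard within the year, which is a common convention but is not stated in the text.

```python
def weighted_mortality(p_male, p_female, male_share=0.70):
    """Cohort all-cause mortality as a weighted sum of sex-specific values."""
    return male_share * p_male + (1.0 - male_share) * p_female


def annual_to_monthly(p_annual):
    """Convert an annual probability to a monthly one (constant-hazard assumption)."""
    return 1.0 - (1.0 - p_annual) ** (1.0 / 12.0)


# Placeholder annual death probabilities for a 60-year-old (NOT life-table values).
p_annual = weighted_mortality(p_male=0.011, p_female=0.007)
p_monthly = annual_to_monthly(p_annual)
```

Compounding `p_monthly` over twelve cycles returns the annual probability, so the monthly model stays consistent with the annual source data.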

Treatment-Related Adverse Events.
The monthly probabilities of treatment-related adverse events (trAEs) and related death or treatment discontinuation associated with each strategy were estimated as weighted averages from the safety data of the respective trials, namely, the KEYNOTE-012, KEYNOTE-059, KEYNOTE-061, and KEYNOTE-006 (melanoma) trials on PEM and the RAINBOW trial on RAM/PAC (Table 1) [6,7,10,20,30,31]. These probabilities were multiplied by the proportion of patients receiving treatment in each cycle (i.e., patients with stable disease). Associated costs and disutilities were then applied based on the resulting percentages. We assumed that patients who discontinued treatment entered the progressive disease state. In addition, we assumed that trAEs occurred within the first six months of treatment based on estimates of the timing of trAEs in the literature [31,41]. Late-onset trAEs were neglected due to insufficient data. We also assumed that the toxicity profile for PEM remained consistent across biomarker subgroups. Finally, we accounted for differences in hospitalization rates across treatment arms by modifying the percentage of the population experiencing a grade 3 or 4 trAE, and assuming that only this portion of the population would be eligible for inpatient admission.

Estimates of Costs and Quality of Life.
Cost and quality of life model inputs are summarized in Table 1. We estimated costs of biomarker testing, drug acquisition and administration, and follow-up visits to general practitioners, oncologists, and radiologists from a US payer perspective. The costs of biomarker testing were estimated using the 2019 CMS Physician Fee Schedule and Clinical Diagnostic Laboratory Fee Schedule [23,24]. Drug costs were estimated using the 2019 Medicare Average Sales Price Drug Pricing File, assuming an average patient weight of 65 kg and average body surface area of 1.85 m² (weighted average for men and women in a 70:30 ratio to reflect clinical trial populations) [32,42,43]. Drug administration costs were derived from the literature and adjusted to reflect the number of infusions administered per month, multiple therapies, and associated costs of premedication [33]. The monetary costs associated with trAEs were assumed to be primarily those resulting from hospitalization costs. These were estimated as weighted averages of the costs associated with specific grades 3-4 trAEs and the frequency of occurrence of those events in the respective trials [22,33,34,44,45]. Hospitalization due to trAEs was assumed to occur for grades 3-4 only, at a rate that was estimated using insurance claims data [4]. Costs associated with trAEs managed in outpatient settings were assumed to be negligible. All costs were in US dollars (USD).
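The dosing assumptions above translate into milligrams per administration as follows. This is an illustrative sketch: the function names are hypothetical, `price_per_mg` is a placeholder rather than a value from the Medicare pricing file, and vial wastage and administration costs are ignored.

```python
WEIGHT_KG = 65.0   # average patient weight stated in the text
BSA_M2 = 1.85      # average body surface area stated in the text


def dose_mg(amount, basis):
    """Milligrams per administration for flat, weight-based, or BSA-based dosing."""
    if basis == "flat":
        return amount
    if basis == "per_kg":
        return amount * WEIGHT_KG
    if basis == "per_m2":
        return amount * BSA_M2
    raise ValueError(f"unknown dosing basis: {basis}")


def monthly_drug_cost(mg_per_dose, doses_per_month, price_per_mg):
    """Drug acquisition cost per monthly cycle, ignoring vial wastage."""
    return mg_per_dose * doses_per_month * price_per_mg


pac_mg = dose_mg(80, "per_m2")   # paclitaxel, 80 mg/m^2
ram_mg = dose_mg(8, "per_kg")    # ramucirumab, 8 mg/kg
pem_mg = dose_mg(200, "flat")    # pembrolizumab, 200 mg flat dose

# price_per_mg = 1.0 is a placeholder so the arithmetic is easy to follow.
pac_monthly = monthly_drug_cost(pac_mg, doses_per_month=3, price_per_mg=1.0)
```

With these inputs a paclitaxel administration is about 148 mg and a ramucirumab administration is 520 mg; the monthly cost then scales with the number of infusions per cycle and the unit price.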
Quality-adjusted life years (QALYs) were derived by adjusting survival times by health state utility scores (0 = death, 1 = perfect health) as measures of quality of life (QoL). Utility scores associated with PFS and 3L BSC following progression were estimated from the literature and adjusted by the distributions of ECOG scores in clinical trials [25,26]. Published estimates report that the utility score for a patient with an ECOG score of 1 is 0.1 less than the utility score of a patient with an ECOG score of 0. Reductions in QoL due to grades 1-2 and 3-4 trAEs were accounted for using disutility scores from a previous study on nivolumab for colorectal cancer [36]. As with costs, disutilities for trAEs were estimated as weighted averages for each treatment arm. All costs and utility scores were discounted by 3% annually [46].
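Utility weighting and 3% annual discounting in a monthly-cycle model can be sketched as below. The utility values and the cohort trace are hypothetical, and the convention of discounting each cycle at the end of the month with a fractional-year exponent is an assumption; the text does not specify which convention was used.

```python
def discounted_qalys(trace, utilities, annual_rate=0.03):
    """Sum monthly utility-weighted state occupancy, discounted to present value.

    trace: list of dicts, one per monthly cycle, mapping state -> cohort fraction.
    utilities: dict mapping each living state -> utility score (death counts as 0).
    """
    total = 0.0
    for month, occupancy in enumerate(trace, start=1):
        # One monthly cycle contributes 1/12 of a year at the state's utility.
        cycle_qalys = sum(occupancy[state] * u for state, u in utilities.items()) / 12.0
        total += cycle_qalys / (1.0 + annual_rate) ** (month / 12.0)
    return total


# Hypothetical utilities and a cohort that stays stable for 12 months.
utilities = {"stable": 0.80, "progressed": 0.60}
trace = [{"stable": 1.0, "progressed": 0.0}] * 12

undiscounted = discounted_qalys(trace, utilities, annual_rate=0.0)
discounted = discounted_qalys(trace, utilities)
```

In this toy case a full year in the stable state at utility 0.80 yields 0.80 undiscounted QALYs and slightly less once discounting is applied; the same loop applies to costs by replacing utilities with per-cycle costs.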

Model Outcomes.
The primary outcomes of the analysis were QALYs gained and ICERs, analyzed in the context of an efficiency frontier. ICERs were calculated as the ratio of differences in costs and QALYs between a strategy and the next best alternative. A willingness-to-pay (WTP) threshold of $100,000 per QALY gained was used in the base case as the cutoff for cost effectiveness [47]. Secondary outcomes were survival (unadjusted life expectancy) and total costs.
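The efficiency-frontier logic can be illustrated in a few lines: strategies are ordered by cost, dominated strategies are removed, and each remaining strategy's ICER is computed against the next less costly strategy on the frontier. The function name, strategy labels, and cost/QALY values below are hypothetical and do not reproduce the study's results.

```python
def efficiency_frontier(strategies):
    """Return [(name, ICER)] for strategies on the frontier.

    strategies: dict mapping name -> (total cost, total QALYs).
    The least costly frontier strategy has ICER None (reference).
    """
    ordered = sorted(strategies.items(), key=lambda kv: (kv[1][0], -kv[1][1]))
    frontier = []  # entries are (name, cost, qalys) with increasing cost and QALYs
    for name, (cost, qalys) in ordered:
        # Strong dominance: costs at least as much without adding QALYs.
        if frontier and qalys <= frontier[-1][2]:
            continue
        # Extended dominance: drop a frontier point whose ICER is not lower
        # than the ICER of the new strategy measured against that point.
        while len(frontier) >= 2:
            _, c1, q1 = frontier[-1]
            _, c0, q0 = frontier[-2]
            if (c1 - c0) / (q1 - q0) >= (cost - c1) / (qalys - q1):
                frontier.pop()
            else:
                break
        frontier.append((name, cost, qalys))
    results = [(frontier[0][0], None)]
    for (_, c0, q0), (n1, c1, q1) in zip(frontier, frontier[1:]):
        results.append((n1, (c1 - c0) / (q1 - q0)))
    return results


# Hypothetical strategies: name -> (cost in USD, QALYs).
example = {
    "A": (10_000, 0.20),
    "B": (20_000, 0.40),
    "C": (50_000, 0.45),
    "D": (60_000, 0.60),
    "E": (30_000, 0.35),
}
icers = efficiency_frontier(example)
wtp = 100_000
cost_effective = [n for n, icer in icers if icer is not None and icer <= wtp]
```

In this toy example, strategy E is strongly dominated by B and strategy C is removed by extended dominance, leaving A, B, and D on the frontier; only strategies whose ICER falls at or below the WTP threshold would be considered cost-effective.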

Sensitivity Analysis.
We quantified uncertainty by measuring model output variation in response to changing input parameters via deterministic (one-way) and probabilistic sensitivity analyses (PSA). We focused on the ICER endpoint in these analyses in order to determine whether the cost-effectiveness results would change as input parameters varied. One-way sensitivity analyses were performed by varying one parameter at a time within prescribed bounds and recording the change in ICERs in a tornado diagram. In probabilistic sensitivity analyses, all parameters were sampled simultaneously from probability distributions in 10,000 Monte Carlo samples per strategy, and the outcomes were recorded in cost-effectiveness planes. The ranges for the one-way sensitivity analyses and the distributions for the probabilistic sensitivity analyses are presented in Table 2.
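A probabilistic sensitivity analysis of this kind can be sketched with the standard library alone. The example below compares one hypothetical strategy with a reference and reports the share of Monte Carlo samples in which it is cost-effective at the WTP threshold. The distribution families and their parameters are assumptions chosen for illustration; they are not the distributions listed in Table 2.

```python
import random


def psa_probability_cost_effective(n_samples=10_000, wtp=100_000, seed=0):
    """Share of Monte Carlo samples with positive incremental net monetary benefit."""
    rng = random.Random(seed)
    favorable = 0
    for _ in range(n_samples):
        # Hypothetical distributions: gamma for costs and survival, beta for utility.
        d_cost = rng.gammavariate(25.0, 2_000.0)      # mean 50,000 USD
        utility = rng.betavariate(70.0, 30.0)         # mean 0.70
        d_life_years = rng.gammavariate(16.0, 0.02)   # mean 0.32 years
        d_qaly = utility * d_life_years
        # With d_qaly > 0, net monetary benefit > 0 is equivalent to ICER < WTP.
        if wtp * d_qaly - d_cost > 0:
            favorable += 1
    return favorable / n_samples


p_cost_effective = psa_probability_cost_effective()
```

Each sampled pair of incremental cost and incremental QALYs corresponds to one point on a cost-effectiveness plane; fixing the seed makes the run reproducible.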

Base Case.
The base case results for all strategies are presented in Table 3 and in Figures 2 and 3 (efficiency frontier). Overall survival and progression-free survival for strategies on the efficiency frontier are shown in Supplementary Figure 2. Biomarker-based strategies substantially increased quality-adjusted survival and unadjusted survival compared to BSC and PAC (Figure 2). All biomarker-based strategies produced similar QALY outputs, between 0.47 and 0.58 QALYs, a variation of only 1.32 months. The most effective strategy was PEM for MSI-H and RAM/PAC for MSS patients, with a total of 1.2 life years and 0.58 QALYs. This strategy increased unadjusted survival by 8.6 months compared to BSC and 3.8 months compared to PAC. In addition, QALYs increased by 4.4 months compared to BSC and 2.0 months compared to PAC. The effectiveness of biomarker-based strategies improved cost effectiveness, but the ICERs still surpassed the WTP threshold. PAC for all patients was the only cost-effective strategy, with an ICER of $53,705/QALY.

Sensitivity Analyses.
Deterministic and probabilistic sensitivity analyses indicated that our base case results were robust to uncertainty in model parameters. The tornado diagram in Figure 4 illustrates the results of a one-way sensitivity analysis. Total costs and total QALYs increased with MSI-H prevalence, resulting in total costs between $47,000 and $111,000 and total QALYs between 0.43 and 0.63 (Supplementary Figure 4). However, the ICERs remained quite stable since biomarker prevalence drove both costs and QALYs.

Discussion
Our analysis used a decision-analytic framework to analyze the impact of personalized, biomarker-based regimens for second-line treatment of metastatic gastric cancer. The most effective treatment regimens were biomarker-based, with PEM for MSI-H patients and RAM/PAC for MSS patients resulting in the most QALYs. Our study highlights the benefit of the broader ongoing shift toward personalized, biomarker-based cancer treatment in patients who would have been offered supportive/palliative care in the past. Historically, only about 20% of GC patients went on to 2L therapy, but, with increasing efficacy in the 2L setting, that number has increased to over 50% [4,48].
Perhaps reflecting the diversity of current treatment, the NCCN recommendations for this patient population encompass a broad range of approaches, including basic supportive care. Although our modeling analysis finds substantial benefit for targeted treatment compared to supportive care or conventional chemotherapy (paclitaxel), these newer strategies are expensive and not cost-effective. Improving the precision of predictive biomarkers of response and reducing the cost of immunotherapeutic and biologic agents could continue to improve the cost effectiveness of targeted therapies. Our analysis provides important data for decision-making, as clinical trials directly comparing such a large number of potential treatment strategies are not feasible. As a model-based analysis, we were able to incorporate quality of life and cost considerations to estimate outcomes that are of interest to various stakeholders. We also developed our model utilizing the most current data available and provided extensive details regarding our approach and assumptions to provide transparency and diminish concerns about unknown biases in the model (see additional details in Supplementary Materials). Finally, we performed extensive sensitivity analyses to explore the impact of uncertainty in model parameters on analysis results.
We identified several limitations in the current study. In the absence of long-term follow-up data, we extended the Kaplan-Meier estimators of overall and progression-free survival beyond the endpoints of the associated clinical trials. In addition, we used data for all patients to inform the survival outcomes for MSS/PD-L1-negative patients receiving PAC or RAM/PAC. Because our model inputs were derived from clinical trial data collected from patients with high performance status, the results may not generalize to patients with poorer performance status. Nonetheless, patient characteristics across clinical trials were comparable (see Supplementary Materials). Finally, late-onset severe adverse events associated with targeted therapies and immunotherapies were not included in the model due to insufficient data.
Our analysis was based on currently available biomarkers and treatments. Biomarker discovery is an active area of investigation and future biomarkers that improve the precision of treatment selection could improve both the effectiveness and cost-effectiveness of novel strategies. Although the biomarker-based approach we assessed in our current analysis did not meet the willingness-to-pay threshold for cost effectiveness, the ability to further define the optimal treatment group might continue to reduce the ICER to a point where these therapies achieve cost effectiveness.
Primary drivers of the cost-effectiveness results were the current costs of newer therapeutic agents. The most effective strategies were expensive, rendering those strategies cost-ineffective. A threshold analysis indicated that the cost of PEM would have to be approximately $3,200 per month in order for the MSI-H: PEM/MSS: PAC strategy to be cost-effective (Supplementary Figure 5). The cost of cancer is a serious matter, especially as more effective and more costly treatments enter the market. Emerging research has found high (28%-48%) rates of treatment-related financial toxicity in cancer patients, with negative consequences for patient-reported outcomes and treatment adherence [49]. A combination of improving precision medicine and lowering drug prices could achieve cost effectiveness and improve patient outcomes. Future clinical data will allow for improved model inputs and fewer assumptions than those made for the current analysis.
In conclusion, our modeling analysis finds that personalized, biomarker-based treatments improve both quality-adjusted and quality-unadjusted survival but are not cost-effective, largely because of the prices of therapeutic agents.
Data Availability
The cost and effectiveness data used to support the findings of this study are included within the article with citations to prior studies (Table 3). The source code for the model is available from the corresponding author upon request.

Conflicts of Interest
Dr. Neugut was a consultant for Otsuka, United BioSource Corporation, Hospira, Eisai, and Teva, and is a member of the Scientific Advisory Board of EHE Intl. Dr. Manji is a member of the Scientific Advisory Board of Roche/Genentech and received funding from Plexxikon Inc., Merck Inc., and Genentech/Roche. Dr. Hur served as a consultant for Novo Nordisk, Gilead Sciences, and Precision Health Economics.

Supplementary Materials
The supplementary materials provide details on additional methods: sensitivity analyses. Supplementary Table 1