Gastric Cancer (GC) is the fifth most common malignancy worldwide [
In recent years, serum PG detection in high-risk GC populations has been used for primary screening, followed by endoscopy, with relative success. Decreased pepsin levels are associated with an increased risk of GC [
This meta-analysis (MA) pooled nearly fifteen years of Asian and European serum PG screening data to evaluate the accuracy of serum PG for GC screening. Our objective was to provide evidence on the effectiveness of serum PG for diagnosing GC in a clinical setting.
We searched the PubMed, Embase, Cochrane Library, CNKI, WanFang, VIP, and CBM databases. Relevant specialist literature was also retrieved manually. The search period was from January 2003 to January 2018. Studies of PG as a diagnostic test for GC were identified using keywords and search terms including pepsinogen, GC, stomach cancer, stomach neoplasms, and gastric neoplasms.
We included all studies published in Chinese or English in the last 15 years in which PG (PGI or PGII) was used as a diagnostic test for GC. All included studies reported data that could be extracted into complete 2 × 2 tables, used the PGI/PGII ratio (PGR) and/or the PGI level as the index for evaluating GC, and stated a definite diagnostic threshold. In all studies, pathological examination or a barium meal followed by X-ray was the gold standard for diagnosis. The outcome measures were sensitivity (Sen), specificity (Spe), positive likelihood ratio (+LR), negative likelihood ratio (−LR), and area under the summary receiver operating characteristic (SROC) curve (AUC).
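The outcome measures listed above all follow from the four cells of a 2 × 2 table. The following is a minimal sketch of those definitions, not the software used in this study; the class and function names are hypothetical, the counts are illustrative, and the diagnostic odds ratio is included only because it derives from the same counts. It assumes no cell is zero (no continuity correction is applied).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from one diagnostic study (index test vs. gold standard)."""
    tp: int  # test positive, disease present
    fp: int  # test positive, disease absent
    fn: int  # test negative, disease present
    tn: int  # test negative, disease absent


def accuracy_measures(t: TwoByTwo) -> dict:
    """Return Sen, Spe, +LR, -LR, and DOR; assumes no zero cells."""
    sen = t.tp / (t.tp + t.fn)   # proportion of diseased who test positive
    spe = t.tn / (t.tn + t.fp)   # proportion of non-diseased who test negative
    return {
        "Sen": sen,
        "Spe": spe,
        "+LR": sen / (1 - spe),
        "-LR": (1 - sen) / spe,
        "DOR": (t.tp * t.tn) / (t.fp * t.fn),
    }


if __name__ == "__main__":
    example = TwoByTwo(tp=50, fp=136, fn=73, tn=461)  # illustrative counts
    for name, value in accuracy_measures(example).items():
        print(f"{name}: {value:.3f}")
```

Note that the DOR computed from the counts equals +LR divided by −LR for a single table, which is a quick consistency check on extracted data.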
The exclusion criteria were meeting abstracts, studies with ambiguous measurement indexes, and studies with incomplete or unextractable data. Studies in which the data quality was deemed poor and duplicate publications were also excluded. Studies were excluded if the gold standard test was not used for a complete evaluation or if PG was combined with other indicators for GC diagnosis.
Two reviewers independently screened the manuscripts, extracted data, and performed quality evaluation according to the inclusion and exclusion criteria. Disagreements were resolved by discussion or referred to third-party experts for adjudication. The extracted data included (1) the first author, study location, and time of publication; (2) the sample size, age, and gold standard; (3) the outcome indicators, namely true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN); and (4) the quality evaluation of the key elements.
The quality of the included studies was assessed using the QUADAS tool for diagnostic accuracy studies in systematic reviews. Each item was rated “yes,” “no,” or “unclear”: “yes” if the standard was satisfied, “no” if it was not, and “unclear” if the information could not be accurately obtained.
Meta-Disc 1.4 software was used for all statistical analyses, with effect sizes expressed as odds ratios (OR) and 95% confidence intervals (CI). For heterogeneity tests,
A total of 2287 studies were retrieved, 19 of which were included in the final analysis. The gold standard of 16 studies was pathological diagnosis, whilst three studies used a barium meal followed by an X-ray. A total of 169,009 cases received PG screening, of which 67,218 received the gold standard tests (15,566 patients underwent X-ray barium meal tests, and 51,552 patients underwent pathological examination). Screening processes are outlined in Figure
Literature screening process.
The basic characteristics of the study are included in Table
Basic characteristics of the study.
Reference | Location | Sample size | Age | Gold standard | TP | FP | FN | TN | Cut-off value | Detection method |
---|---|---|---|---|---|---|---|---|---|---|
Li et al. [ | Hebei, China | 720 | — | Pathology | 50 | 136 | 73 | 461 | PGR ≤ 6.0 | TRFIA |
Zhang et al. [ | Hebei, China | 720 | — | Pathology | 18 | 38 | 105 | 559 | PGI < 60 | TRFIA |
Kang et al. [ | Korea | 1006 | 57.6 ± 13.2 | Pathology | 98 | 63 | 67 | 102 | PGR ≤ 3.0 | LEI |
Yu et al. [ | Beijing, China | 2668 | — | Pathology | 60 | 279 | 88 | 2241 | PGI ≤ 0.7 | ABA |
Mizuno et al. [ | Japan | 12,120 | 15~84 | Barium meal | 7 | 486 | 12 | 116 | PGI ≤ 30 ng/mL | CT |
Miki et al. [ | Japan | 101,892 | — | Pathology | 115 | 902 | 10 | 464 | PGI ≤ 70 ng/mL | RIA |
Zhang et al. [ | Gansu, China | 918 | >50 | Pathology | 3 | 197 | 4 | 714 | PGI ≤ 70 | ABA |
Zhang et al. [ | Gansu, China | 1502 | — | Pathology | 6 | 576 | 3 | 917 | PGI ≤ 70 ng/mL | ELISA |
Yuan [ | Liaoning, China | 21,338 | 10~87 | Pathology | 69 | 656 | 39 | 146 | PGR ≤ 7.0 | ELISA |
Wei et al. [ | Hebei, China | 753 | >35 | Pathology | 3 | 197 | 4 | 549 | PGI ≤ 75 | LEI |
Xu et al. [ | Jiangsu, China | 1028 | 22~91 | Pathology | 43 | 146 | 15 | 824 | PGI ≤ 70 ng/mL | LEI |
Lomba-Viana et al. [ | Portugal | 13,118 | 40~79 | Pathology | 6 | 268 | 3 | 237 | PGI ≤ 70 ng/mL | ELISA |
Nakajima [ | Japan | 1000 | — | Barium meal | 4 | 196 | 1 | 799 | PGI ≤ 70 ng/mL | RIA |
Zhao et al. [ | Shanxi, China | 725 | — | Pathology | 5 | 180 | 13 | 527 | PGI ≤ 70 | ELISA |
Yuan [ | Shandong, China | 160 | 65.2 ± 4.8 | Pathology | 34 | 26 | 26 | 74 | PGI ≤ 70 ng/mL | ELISA |
Zhang et al. [ | Beijing, China | 518 | 13~86 | Pathology | 102 | 32 | 23 | 151 | PGI ≤ 62.5 | CMI |
Shikata et al. [ | Japan | 2446 | — | Barium meal | 49 | 731 | 20 | 164 | PGI ≤ 59 ng/mL | RIA |
Juan Cai et al. [ | Xinjiang, China | 464 | 53.3 ± 13.8 | Pathology | 45 | 7 | 61 | 153 | PGI ≤ 72.78 ng/mL | ELISA |
Castro et al. [ | Portugal | 5913 | 40~74 | Pathology | 15 | 210 | 11 | 567 | PGI ≤ 70 ng/mL | ELISA |
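Per-study TP, FP, FN, and TN counts such as those tabulated above can be combined into a single sensitivity and specificity. The sketch below uses the simplest approach, summing counts across studies before dividing. This is an assumption for illustration: the text does not state how Meta-Disc weights studies, and the function name is hypothetical. The three tuples are the first three rows of the table.

```python
def pooled_sen_spe(studies):
    """Pool sensitivity and specificity by summing counts across studies.

    `studies` is an iterable of (tp, fp, fn, tn) tuples. This is simple
    count pooling, assumed here for illustration only.
    """
    tp = fp = fn = tn = 0
    for s_tp, s_fp, s_fn, s_tn in studies:
        tp += s_tp
        fp += s_fp
        fn += s_fn
        tn += s_tn
    sen = tp / (tp + fn)  # pooled true positive rate
    spe = tn / (tn + fp)  # pooled true negative rate
    return sen, spe


if __name__ == "__main__":
    first_three_rows = [
        (50, 136, 73, 461),
        (18, 38, 105, 559),
        (98, 63, 67, 102),
    ]
    sen, spe = pooled_sen_spe(first_three_rows)
    print(f"pooled Sen = {sen:.3f}, pooled Spe = {spe:.3f}")
```

Summing counts gives large studies more influence than small ones; a random-effects model would weight studies differently and can give different pooled values.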
QUADAS quality evaluation.
Included study | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) | (13) | (14) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Li et al. [ | Y | Y | Y | Y | Y | Y | Y | Y | Y | N | Y | Y | U | U |
Zhang et al. [ | Y | Y | Y | Y | Y | Y | Y | Y | Y | N | Y | Y | U | U |
Kang et al. [ | Y | Y | Y | Y | Y | Y | Y | Y | Y | N | Y | Y | U | U |
Zhonglin et al. [ | Y | Y | Y | Y | Y | Y | Y | Y | Y | N | Y | Y | U | U |
Mizuno et al. [ | Y | Y | Y | Y | Y | Y | Y | Y | N | N | Y | Y | U | U |
Miki et al. [ | Y | Y | Y | Y | Y | Y | Y | Y | Y | N | Y | Y | U | U |
Zhang et al. [ | Y | Y | Y | Y | Y | Y | Y | Y | Y | U | U | Y | U | U |
Zhang et al. [ | Y | Y | Y | Y | Y | Y | Y | Y | Y | U | U | Y | U | U |
Yuan [ | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | N | Y | U | U |
Wei et al. [ | Y | Y | Y | Y | Y | Y | Y | Y | N | U | U | Y | U | U |
Xu et al. [ | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | N | Y | U | U |
Lomba-Viana et al. [ | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | N | Y | U | Y |
Nakajima et al. [ | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | N | Y | U | U |
Zhao et al. [ | Y | Y | Y | Y | Y | Y | Y | Y | Y | N | Y | Y | U | U |
Yuan [ | Y | Y | Y | Y | Y | Y | Y | Y | Y | N | Y | Y | U | U |
Zhang et al. [ | Y | Y | Y | Y | Y | Y | Y | Y | Y | N | Y | Y | U | U |
Shikata et al. [ | Y | Y | Y | Y | Y | Y | Y | Y | Y | U | U | Y | U | Y |
Juan Cai et al. [ | Y | Y | Y | Y | Y | Y | Y | Y | Y | N | Y | Y | U | U |
Castro et al. [ | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | N | Y | U | Y |
Y: yes; N: no; U: unclear. (1) Does the case spectrum include a variety of cases and/or confounding cases? (2) Are the selection criteria for the study subjects clear? (3) Can the gold standard accurately distinguish diseased from disease-free status? (4) Is the interval between the gold standard and the test under evaluation (index test) short enough to avoid changes in disease status? (5) Did all samples, or a random selection of samples, receive the gold standard test? (6) Did all cases receive the same gold standard test regardless of the index test result? (7) Is the gold standard test independent of the index test (i.e., the index test is not part of the gold standard)? (8) Is the execution of the index test described clearly enough to be repeatable? (9) Is the execution of the gold standard test described clearly enough to be repeatable? (10) Were the index test results interpreted without knowledge of the gold standard results? (11) Were the gold standard results interpreted without knowledge of the index test results? (12) Were the clinical data available when the test results were interpreted the same as those available in actual practice? (13) Were hard-to-interpret/intermediate test results reported? (14) Were withdrawals from the study explained?
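The 14 QUADAS ratings for a study can be summarized by counting how many items were rated yes, no, or unclear. The following is a small sketch of that tally; the function name is hypothetical, and the text does not state how ratings were converted into the quality grades used later, so no grading rule is implemented here.

```python
from collections import Counter

VALID_RATINGS = ("Y", "N", "U")  # yes, no, unclear


def tally_quadas(ratings):
    """Count Y/N/U ratings for one study across the 14 QUADAS items."""
    ratings = list(ratings)
    if len(ratings) != 14:
        raise ValueError("QUADAS has 14 items; got %d ratings" % len(ratings))
    unknown = set(ratings) - set(VALID_RATINGS)
    if unknown:
        raise ValueError("unexpected rating(s): %s" % sorted(unknown))
    counts = Counter(ratings)
    # Always report all three categories, even when a count is zero.
    return {rating: counts.get(rating, 0) for rating in VALID_RATINGS}


if __name__ == "__main__":
    # Ratings as they appear in the first row of the QUADAS table.
    row = "Y Y Y Y Y Y Y Y Y N Y Y U U".split()
    print(tally_quadas(row))
```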
The ROC plane scatter chart did not display a “shoulder arm” appearance, with the Spearman correlation coefficient being 0.457 and
The MA showed a pooled Sen of 0.56 (95% CI 0.53–0.59) and a pooled Spe of 0.71 (95% CI 0.70–0.71). Thus, PG screening failed to identify GC in 44% of patients with the disease (missed diagnoses) and gave a positive result in 29% of those without it (misdiagnoses) (Figures
Pooled sensitivity.
Pooled specificity.
The MA showed a pooled +LR of 2.82 (95% CI 2.06–3.86), indicating that a positive PG result was 2.82 times more likely in patients with GC than in those without. The pooled −LR was 0.56 (95% CI 0.45–0.68), indicating that a negative PG result cannot rule out GC (Figures
Pooled positive likelihood ratios.
Pooled negative likelihood ratios.
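Likelihood ratios are usually interpreted by converting a pre-test probability of disease into a post-test probability through odds. The sketch below applies that standard conversion to the pooled +LR of 2.82 and −LR of 0.56 reported above. The 5% pre-test probability is a hypothetical value chosen only for illustration and is not taken from this study; the function name is likewise an assumption.

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    if not 0.0 < pre_test_prob < 1.0:
        raise ValueError("pre_test_prob must be strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


if __name__ == "__main__":
    pre = 0.05  # hypothetical pre-test probability of GC
    after_positive = post_test_probability(pre, 2.82)  # pooled +LR
    after_negative = post_test_probability(pre, 0.56)  # pooled -LR
    print(f"after a positive PG result: {after_positive:.3f}")
    print(f"after a negative PG result: {after_negative:.3f}")
```

A likelihood ratio of 1 leaves the probability unchanged, so the further a ratio lies from 1, the more a test result shifts the estimate.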
The forest plot of the diagnostic odds ratio (DOR) showed a pooled DOR of 5.41 (95% CI 3.64–8.06), indicating that the odds of a positive PG result were 5.41-fold higher in patients with GC than in those without and suggesting that PG has some accuracy for GC diagnosis (Figure
Diagnostic odds ratios.
From the SROC curves, the AUC = 0.7468 and
ROC scatter plot.
SROC curve.
Metaregression was used to analyze the sources of heterogeneity caused by the nonthreshold effect. Subgroup analysis was conducted based on region, publication date, diagnosis method, detection method, and study quality. The results showed a DOR of 3.98 (
Subgroup analysis summary.
Subgroup | Category | DOR | I² | P |
---|---|---|---|---|
Date of publication | Before 2010 | 3.98 | 80.10% | <0.01 |
Date of publication | After 2010 | 6.24 | 84.00% | <0.01 |
Country/region | Europe | 8.44 | 94.00% | <0.01 |
Country/region | Asia | 5.05 | 82.50% | <0.01 |
Diagnosis method | Pathology | 4.96 | 86.70% | <0.01 |
Diagnosis method | Barium meal | 8.54 | 44.90% | 0.163 |
Detection method | ELISA | 4.97 | 86.60% | <0.01 |
Detection method | Others | 5.57 | 85.20% | <0.01 |
Study quality | General | 3.72 | 84.50% | <0.01 |
Study quality | Higher | 6.91 | 86.00% | <0.01 |
To exclude the impact of low-quality studies on the MA results, a sensitivity analysis was performed on all studies. The results showed that the DOR of PG screening for GC in each group was ≥3 (
In the funnel plot generated using Stata 12.0, each circle represents an included study, and the circles were distributed approximately symmetrically about the central axis (
Following lung and liver cancer, GC is the third leading cause of global cancer deaths [
The occurrence and development of GC display regional differences. GC incidence is markedly lower in North America and Western Europe, with the highest incidence in East Asia, Eastern Europe, and South America [
PGI is secreted from the gastric fundus gland, whilst PGII is secreted from the glandular body and the pylorus glands in the antrum and proximal duodenum [
Carcinogenesis of GC is a multistage process in which chronic active gastritis progresses to atrophic gastritis, intestinal metaplasia, atypical hyperplasia, and eventually cancer (Correa model) [
The International Cancer Research Institute list
ROC analysis is a widely accepted method for selecting the optimal cut-off value for a diagnostic test, in addition to assessing its sensitivity and specificity. The AUC represents test effectiveness, with an area > 0.9 indicating high test efficiency, 0.7–0.9 medium performance, 0.5–0.7 low efficiency, and 0.5 a chance result [
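The AUC bands described above can be written as a simple lookup. This is a sketch only: the text does not say which band the boundary values 0.7 and 0.9 belong to, so assigning both to the medium band is an assumption, as is treating any value at or below 0.5 as a chance result. The function name is hypothetical.

```python
def classify_auc(auc: float) -> str:
    """Map an AUC value to the efficiency band described in the text.

    Boundary handling (0.7 and 0.9 counted as "medium", values <= 0.5
    counted as "chance") is an assumption made for this sketch.
    """
    if not 0.0 <= auc <= 1.0:
        raise ValueError("AUC must lie between 0 and 1")
    if auc > 0.9:
        return "high"
    if auc >= 0.7:
        return "medium"
    if auc > 0.5:
        return "low"
    return "chance"


if __name__ == "__main__":
    # AUC reported from the SROC curve in this meta-analysis.
    print(classify_auc(0.7468))
```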
This study had some notable limitations. (1) Only studies published in Chinese or English were searched, which may have introduced selection bias. (2) Blinding and randomization were unclear in some studies, and study quality was variable, leading to variation in the data obtained. (3) Because age information was not available for all subjects, age could not be assessed as a possible confounding factor. (4) Because of limitations in the content of the included studies, the definition of high-risk groups differed between regions and experimental methodologies were not reported in detail. Some studies lacked data on tumor location or type, so the sensitivity and specificity of PG screening may vary between different types of GC. Finally, this meta-analysis was based on published reports rather than individual patient data.
In summary, we report that PG contributes to the diagnosis of GC, with moderate diagnostic performance. Although no studies have directly demonstrated that PG screening can reduce GC mortality, it does provide a valuable means of identifying high-risk groups who require endoscopy. To provide more scientific and objective references for clinical applications, further research is required using rigorous designs, large sample sizes, and multicenter diagnostic assessments. A unified detection method and strict quality control measures are necessary to reduce bias and to ensure that research results are credible and useful for guiding practice. Following these recommendations could lead to safer, more economical, more convenient, and more accurate methods for screening groups at high risk of GC.
The data supporting this meta-analysis are from previously published studies and data sets, which have been cited. The processed data are available in PubMed.
The authors declare that there is no conflict of interest regarding the publication of this article.
We especially thank all researchers and teachers who participated and made this study possible. Thank you for the support and funding. This work was supported by the Provincial Humanities and Social Science Research Project of Anhui Colleges (no. SK2018A0190).