Interval Cancers in a Population-Based Screening Program for Colorectal Cancer in Catalonia, Spain

Objective. To analyze interval cancers among participants in a screening program for colorectal cancer (CRC) during four screening rounds. Methods. The study population consisted of participants of a fecal occult blood test-based screening program from February 2000 to September 2010, with a 30-month follow-up (n = 30,480). We used hospital administration data to identify CRC. An interval cancer was defined as an invasive cancer diagnosed within 30 months of a negative screening result and before the next recommended examination. Gender, age, stage, and site distribution of interval cancers were compared with those in the screen-detected group. Results. Within the study period, 97 tumors were screen-detected and 74 tumors were diagnosed after a negative screening. In addition, 17 CRC (18.3%) were found after an inconclusive result and 2 cases were diagnosed within the surveillance interval (2.1%). There was an increase of interval cancers over the four rounds (from 32.4% to 46.0%). When compared with screen-detected cancers, interval cancers were found predominantly in the rectum (OR: 3.66; 95% CI: 1.51–8.88) and at more advanced stages (P = 0.025). Conclusion. There are large numbers of cancer that are not detected through fecal occult blood test-based screening. The low sensitivity should be emphasized to ensure that individuals with symptoms are not falsely reassured.


Introduction
Colorectal cancer (CRC) is the third most common cancer and the fourth leading cause of cancer death worldwide. About 1.36 million people are diagnosed annually with CRC, and approximately 694,000 die from CRC annually [1]. Approximately 54% of CRC cases are diagnosed in developed countries, and Europe represents one of the regions with the highest rates in both incidence and mortality. As the sojourn time for CRC is several years and a good prognosis is associated with diagnosis at early stage, screening has been implemented in many countries [2]. The rationale behind cancer screening programs is that early detection of cancer (before symptoms arise) will reduce cause-specific mortality [3].
Compelling and consistent evidence from randomised controlled trials shows that fecal occult blood test and flexible sigmoidoscopy reduce CRC mortality [4,5]. However, screening also has the potential to harm. The relationship between benefits and risks depends on the quality of screening [6,7]. The monitoring of interval cancers (IC) is a crucial part of the evaluation of a CRC screening program and a key performance indicator. It provides a mechanism to evaluate the likely impact of the program on CRC mortality in the target population. IC are those that occur following a negative screening episode, in the interval before the next invitation to screening is due [8]. For fecal occult blood testing IC may occur following a negative test, or following a positive test result with a negative further assessment (colonoscopy) [9]. Although IC are inevitable in a screening program, their number should be as small as possible since a high proportion would decrease screening effectiveness. Four plausible reasons (missed polyps or CRC, incompletely resected polyps, rapid progression of new polyps, and failure of biopsy to diagnose a CRC that was present) have been proposed to explain IC [10]. Two studies concluded that 50% to 75% of interval CRC were likely the result of missed or incompletely resected lesions and less than 30% were the result of rapidly progressing lesions [11,12].
Calculation of the IC rate also allows the calculation of other performance indicators, such as the proportional incidence and program sensitivity. The proportional incidence method compares the incidence rate of IC in successive periods after a negative screen with the expected incidence in the absence of screening. The difference between the two rates gives the number and proportion of CRC whose diagnosis has been advanced by screening [13]. On the other hand, program sensitivity (traditional method) compares screen-detected CRC (SD) with IC. Such a method implies some degree of overestimation of sensitivity, particularly when determined at the first prevalence screening [14]. Proportional incidence also has its limitations, mainly the difficulty of estimating the underlying incidence in the absence of screening (i.e., when a cancer registry is lacking or when screening has been ongoing for a long time) [14].
The aim of this study was to analyze IC among participants in the screening program for CRC of L'Hospitalet de Llobregat during four screening rounds. As a secondary objective, program sensitivity was analyzed according to demographic, screening, and tumor characteristics.

Screening Procedure.
In 2000, a biennial screening program was launched in L'Hospitalet of Llobregat, an industrial city in the metropolitan area of Barcelona (Catalonia, Spain). The target population includes all men and women aged 50-69 years who live in the city (average of 65,000). Demographic data on this population are gathered from the Primary Healthcare Information System. L'Hospitalet of Llobregat is divided into 12 Basic Health Areas, and screening invitations are sent to the eligible population assigned to each Basic Health Area. Subjects who do not meet the inclusion criteria for CRC screening are permanently or temporarily excluded according to the following criteria: personal history of CRC or adenomas, hereditary and familial CRC, inflammatory bowel disease, colonoscopy in the previous 5 years, fecal occult blood test within the previous 2 years, terminal disease, and severe disabling condition. Subjects moving out of the screening area or whose invitation letter is returned because of an invalid mailing address are also excluded.
CRC screening criteria are assessed by means of a questionnaire. When the questionnaire reports two or more relatives with cancer, the program phones the individual to check the information and evaluate whether he/she is eligible for CRC screening with fecal occult blood testing or meets the criteria for hereditary colorectal cancer. If a high-risk family history is confirmed, the individual is excluded from the program and referred to a genetic counseling unit for a more detailed assessment. The program allows the screening of individuals with a family history of CRC or other noncolonic neoplasms as long as they do not meet the criteria for hereditary cancer.
Since 2000, two screening test strategies have been used. From the first to third rounds, a guaiac FOBT (gFOBT) was used as the screening test (hema-screen™, Immunostics Inc., Ocean, NJ, USA). In the fourth round, the gFOBT was offered to 50,227 individuals (eligible population assigned to 10 Basic Health Areas) and a quantitative immunochemical test (OC-Auto Sampling Bottle 3, Eiken Chemical Co., Japan) was offered to 12,707 (eligible population assigned to two Basic Health Areas). The immunochemical test (FIT) was initially introduced to the screening program to evaluate its feasibility and acceptability. Briefly, participation was higher among individuals who used the FIT (OR: 1.35; 95% CI: 1.27-1.42). Detection rates for adenomas and cancer were also higher for the FIT, most notably the detection rate for high-risk adenomas (26.7‰ versus 3.0‰). The positive predictive value for high-risk adenomas was quite similar (45.0% and 46.9% in the FIT and gFOBT, resp.) [15]. As a result, FIT remained the only strategy for further screening rounds (fifth round and onwards).
Participants with gFOBT collected six fecal samples (two samples from three separate bowel movements). The gFOBT uses a chemical indicator that shows a color change in the presence of blood. The possible results of the gFOBT were (a) weakly positive: one to four positive samples. Those participants with a weakly positive result were asked to perform a second gFOBT and, if any sample was positive, were offered colonoscopy without further testing. In contrast, if all six samples were negative, a third gFOBT was requested; (b) spoilt kit/technical failure: laboratory was unable to analyze the kit. The most common reason for a rejected kit was that the information provided with the kit was not complete. Those participants who refused to repeat the test after a weak positive or a spoilt kit/technical failure were coded with an indeterminate gFOBT result; (c) negative: zero out of six positive samples; (d) strongly positive: five or six positive samples.
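The mapping from the number of positive samples to the gFOBT result categories described above can be sketched in a few lines of Python. This is only an illustration of the stated thresholds; the function name `classify_gfobt` is hypothetical and is not part of the program's information system. Spoilt kits and technical failures are outside its scope because they are not defined by a sample count.

```python
def classify_gfobt(positive_samples: int) -> str:
    """Classify one gFOBT kit (six samples) by its number of positive samples.

    Thresholds follow the text: 0 -> negative, 1-4 -> weakly positive,
    5-6 -> strongly positive. The function name is illustrative only.
    """
    if not 0 <= positive_samples <= 6:
        raise ValueError("a gFOBT kit contains six samples")
    if positive_samples == 0:
        return "negative"
    if positive_samples <= 4:
        return "weakly positive"
    return "strongly positive"


if __name__ == "__main__":
    for count in range(7):
        print(count, classify_gfobt(count))
```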
On the other hand, participants with FIT collected one sample (approximately 10 mg of feces) which was added to 2 mL buffer. A 100 ng Hb/mL cut-off (20 µg Hb/g feces) was used as threshold for test positivity. Tests were generally assayed on the day of receipt in the laboratory on one of two automated clinical analyzers (OC-Sensor Micro or OC-Sensor Diana). Samples were stored at 2–8 °C if not analyzed on the day of receipt and then allowed to warm to room temperature for the assay. Each sample was analyzed once. The upper limit of the analytical working range for the fecal Hb concentration measurements was 1,000 ng Hb/mL buffer (200 µg Hb/g feces); samples with concentrations greater than this were not diluted and not reassayed. When the laboratory was unable to analyze a test (spoilt kit or technical failure), the participant was asked to repeat the FIT. If she/he refused, then the final test result was coded as indeterminate.
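The relation between the buffer concentration and the fecal concentration follows from the sampling volumes given above: with about 10 mg of feces in 2 mL of buffer, 100 ng Hb/mL corresponds to 20 µg Hb per gram of feces. The short Python sketch below reproduces that arithmetic. The function names are hypothetical, and treating a value exactly at the cut-off as positive is an assumption, since the text does not say whether the threshold is inclusive.

```python
def buffer_to_feces_concentration(ng_hb_per_ml: float,
                                  buffer_ml: float = 2.0,
                                  feces_mg: float = 10.0) -> float:
    """Convert ng Hb/mL buffer to µg Hb/g feces.

    Defaults (2 mL buffer, ~10 mg feces) are the values stated in the text.
    ng per mg is numerically equal to µg per g.
    """
    total_hb_ng = ng_hb_per_ml * buffer_ml
    return total_hb_ng / feces_mg


def fit_is_positive(ng_hb_per_ml: float, cutoff_ng_per_ml: float = 100.0) -> bool:
    """Assumption: a result at or above the cut-off counts as positive."""
    return ng_hb_per_ml >= cutoff_ng_per_ml


if __name__ == "__main__":
    print(buffer_to_feces_concentration(100.0))   # cut-off in µg Hb/g feces
    print(buffer_to_feces_concentration(1000.0))  # upper working limit in µg Hb/g feces
```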
All participants with a positive test result were advised to have colonoscopy. Subjects with no colorectal lesions detected in the colonoscopy are invited for screening again after 10 years (if they are still within the target age group).

Study Population.
The study population consisted of participants of a CRC screening program from February 2000 to September 2010, with a minimum of 30-month follow-up (n = 30,480). The period of study included four rounds of the CRC screening program.

Data Collection and Variables.
The CRC screening program information system included data on individual identification, age, gender, participation, appointment dates, screening test (guaiac or immunochemical), final screening test results (positive, negative, or indeterminate), and colonoscopy results (negative, precursor lesions, and CRC).
We used hospital administration data (minimum data set (MDS)) to identify CRC. A SD was defined as an invasive CRC diagnosed at colonoscopy triggered by a final positive screening test result. On the other hand, IC was defined as an invasive CRC diagnosed following a negative screening episode and prior to the next scheduled screening examination. The next scheduled screening examination was defined to be 30 months after the previous screen. Screening interval was 24 months; however, it should be considered adequate up to 30 months (acceptable delay because of organizational and management issues). Our screening program was launched as a pilot program and was considered an established program by the third round. The median time between invitations was 33 months (higher in the earlier rounds and descending in the subsequent rounds).
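The 30-month rule above can be expressed as a simple date comparison. The following Python sketch is an illustration only: the function names are hypothetical, the dates in the usage example are invented, and counting a diagnosis made exactly 30 months after the negative screen as an interval cancer is an assumption, since the text does not state how the boundary was handled.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`,
    clamping the day to the last day of the target month."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def is_interval_cancer(negative_screen: date, diagnosis: date,
                       window_months: int = 30) -> bool:
    """True if an invasive CRC was diagnosed after a negative screening
    episode and no later than `window_months` months afterwards
    (inclusive boundary is an assumption)."""
    return negative_screen < diagnosis <= add_months(negative_screen, window_months)


if __name__ == "__main__":
    screen = date(2005, 3, 15)  # hypothetical negative screen
    print(is_interval_cancer(screen, date(2006, 11, 2)))  # inside the window
    print(is_interval_cancer(screen, date(2008, 1, 10)))  # beyond 30 months
```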
Electronic medical records of the individuals identified as being diagnosed with CRC were reviewed to gather tumor characteristics, such as the anatomic pathology result of the cancerous lesion and the extension study. CRC were staged according to the tumor-nodal-metastasis (TNM) staging system and classified as early (TNM I/II) or late (TNM III/IV) stage. Tumor site was grouped in three categories: proximal, defined as the region of the colon up to and including the splenic flexure; distal, including the descending and sigmoid colon; and rectum.
Patients were classified into 4 groups according to participation in the screening program: (1) individuals with CRC diagnosed by screening; (2) individuals with a negative screening result and CRC diagnosed prior to the next screening round; (3) individuals with an incomplete screening process, that is, (a) those with an indeterminate test and (b) screenees with a positive test who did not attend the colonoscopy and were diagnosed during the interval as they became symptomatic; and (4) individuals clinically diagnosed with CRC more than 30 months after their last screening (mainly nonattenders in further screening rounds).
The program sensitivity was expressed as the number of SD divided by the sum of SD and IC following a normal screen or assessment (traditional method).
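As a worked illustration of the traditional method, the Python sketch below computes program sensitivity as SD / (SD + IC). The function name is hypothetical; the counts in the usage example (97 screen-detected cancers and 74 interval cancers after a negative screen) are those reported in the Results.

```python
def program_sensitivity(screen_detected: int, interval_cancers: int) -> float:
    """Traditional program sensitivity: SD / (SD + IC), as a proportion."""
    total = screen_detected + interval_cancers
    if total == 0:
        raise ValueError("no cancers to evaluate")
    return screen_detected / total


if __name__ == "__main__":
    sensitivity = program_sensitivity(screen_detected=97, interval_cancers=74)
    print(f"{100 * sensitivity:.1f}%")
```

With these counts the result agrees with the overall program sensitivity of 56.7% reported in the Results.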
The study protocol was approved by the Clinical Research Ethics Committee of the Bellvitge University Hospital and all involved parties followed the ethical requirements set forth in the Spanish Organic Law on Protection of Personal Data (15/1999 of December 13).
Note: in the fourth round the gFOBT was offered to individuals in 10 Basic Health Areas and the FIT was offered to individuals in two Basic Health Areas; * 11 individuals using gFOBT and 7 individuals using FIT were not referred to further assessment because they had a recent colonoscopy.

Statistical Analysis.
A descriptive analysis of all screenees diagnosed with CRC was carried out. Program sensitivity was calculated according to demographic, screening, and tumor characteristics.
Factors associated with IC were analyzed using multivariate logistic regression models. The variables included in the multivariate analysis were gender, age, number of screens, tumor site, and CRC stage. The results were expressed as odds ratios (OR) with 95% confidence intervals (95% CI). Differences were considered statistically significant when P < 0.05. All analyses were performed using R statistical software (R Foundation for Statistical Computing, Vienna, Austria).
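To show how an odds ratio and its 95% confidence interval are read, the Python sketch below computes an unadjusted OR with a Wald interval from a single 2x2 table. It is not the adjusted multivariate model fitted in R for this study: the function name is hypothetical and the counts in the usage example are invented purely for illustration.

```python
import math


def odds_ratio_wald_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    2x2 layout (illustrative): a = IC with the factor, b = IC without it,
    c = SD with the factor, d = SD without it. z = 1.96 gives a 95% CI.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for a Wald interval")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts, NOT study data.
    or_, lo, hi = odds_ratio_wald_ci(a=20, b=10, c=30, d=60)
    print(f"OR: {or_:.2f}; 95% CI: {lo:.2f}-{hi:.2f}")
```

An interval that excludes 1 corresponds to a difference that is statistically significant at the 0.05 level under this approximation.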

Results
Within the study period, 301 CRC were diagnosed in the screened population (Figure 1). Of those, 97 tumors were detected during the screening process and 93 tumors were diagnosed within 30 months after the screening examination. IC were diagnosed after a negative result in the screening process (n = 74). In addition, 17 clinically diagnosed CRC (18.3%) were found after an inconclusive result and 2 cases were diagnosed within the surveillance interval (2.1%).
On the other hand, 111 CRC were found among symptomatic individuals who did not attend screening in further rounds; they were diagnosed with CRC more than 30 months after their last screening (range from 31 months to 12.2 years). Of those, 97 had completed the screening process and had a negative result.
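The counts reported above can be reconciled with a few lines of Python. This is only a consistency check on the figures given in the text (301 CRC in total; 97 screen-detected; 93 diagnosed within 30 months, of which 74 followed a negative result, 17 an inconclusive result, and 2 fell within the surveillance interval; 111 diagnosed later). The variable names are illustrative.

```python
# Counts as reported in the Results.
total_crc = 301
screen_detected = 97
within_30_months = {"after_negative": 74, "after_inconclusive": 17, "surveillance": 2}
after_30_months = 111

diagnosed_within_30 = sum(within_30_months.values())

# The three groups should add up to all CRC diagnosed in the screened population.
groups_total = screen_detected + diagnosed_within_30 + after_30_months

# Share of cancers diagnosed within 30 months that followed an inconclusive result.
inconclusive_share = 100 * within_30_months["after_inconclusive"] / diagnosed_within_30

if __name__ == "__main__":
    print(diagnosed_within_30, groups_total)
    print(f"{inconclusive_share:.1f}%")
```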
Around 85% of the screenees used the gFOBT as the screening strategy and the remaining 15% used the FIT (Table 1). Regarding the clinically diagnosed tumors, eight CRC were diagnosed in individuals who used the FIT and half of the CRC were detected within 30 months after their participation in the screening (Table 2).
The overall program sensitivity was 56.7% and was higher among men and younger individuals. According to the test used, sensitivity was 52.0% with gFOBT and 87.0% with FIT. Table 4 reports the results of the multivariate logistic regression model. When compared with SD, IC were predominantly in the rectum (OR: 3.66; 95% CI: 1.51-8.88). When stratified by gender, no differences in cancer site were found. This held for both SD and IC (Table 5).
Finally, differences in cancer staging were observed. Individuals with a CRC diagnosed within 30 months after their last screening had more advanced stages (P = 0.025).

Discussion
The overall program sensitivity was relatively low. A high false negative rate should be emphasized to clinicians and patients to ensure that those with symptoms are not falsely reassured and slip through the diagnostic net [16].
Otherwise, sensitivity was higher in those programs using FIT (from 71.3% to 85.6%) [23,24]. It is worth mentioning that program sensitivity decreases as more screening rounds are included in the study. Although in our study FIT sensitivity was considerably higher than gFOBT sensitivity (87.0% versus 52.0%, resp.), this result should be interpreted with caution because of the small number of screenees using FIT. Around 18% of CRC were identified in individuals who did not complete the screening process because of an inconclusive FOBT or refusal of further assessment (colonoscopy). Strategies aimed at eliminating or reducing inconclusive test results should be implemented. In this way, both screening quality and program sensitivity would improve.
Efforts should be made to communicate the need for regular screening. One-third of CRC identified among individuals who had ever participated in the CRC screening program were diagnosed more than 30 months after their last screening. Once-in-a-lifetime screening is not enough, especially if the screening modality is FOBT. On the contrary, successive screening is needed to detect and prevent CRC. The sojourn time (the period when a test can detect asymptomatic disease) has been estimated at 2.2 to 4.9 years for gFOBT [25,26].
However, some of the CRC clinically diagnosed more than 30 months after the last screening could have been averted had these individuals been invited within an adequate screening interval. Time between invitations in the early rounds (first and second screening rounds) was on average 36 months due to some management and organizational concerns. By the third round, time between invitations was acceptable.
Regarding factors associated with IC, differences in sensitivity according to tumor site and cancer stage have been found. We did not find differences in the IC according to gender. However, some studies have shown higher IC rates among women [21,22,27,28].
Most of the studies have shown that IC were more frequently located in the proximal colon [18,22,27]. Tazi et al. found IC were more likely to arise in the rectum compared with SD [19].
Finally, our findings regarding IC being diagnosed at more advanced stages are consistent with the literature [19,20,22,27,28].
Table note: ORa, adjusted odds ratios derived from multivariate logistic regression models. Analysis was restricted to those individuals with a negative result in their last screening who were later diagnosed with CRC and those who were diagnosed during the screening process. * Age at diagnosis was considered as a continuous variable; the age range was 50-73 years for individuals with a CRC diagnosed within 30 months after their screening and 55-80 years for individuals who ever participated in the screening program and were diagnosed with CRC.
Among the strengths of this study it is worth mentioning that it is the first study to report IC in a population-based screening program for CRC in Spain. In addition, the length of the study period allows estimating program sensitivity in both prevalent and incident rounds. Most of the studies aimed at analyzing IC have included only two rounds.
Some of the limitations of this study are related to case ascertainment and completeness of cancer data. Interval cancers should ideally be identified through a cancer registry. Checking hospital discharge records may be very useful in areas not covered by a cancer registry, as in our case. However, such a method tends to miss cases treated outside the National Health System and/or without hospitalization, as well as individuals who migrate to a different region. In our study, 85% of CRC diagnosed in private hospitals could be identified. However, there was some missing information regarding tumor characteristics such as the cancer site or stage.
We could not perform a multivariate logistic regression analysis stratified by the test used (gFOBT or FIT) because of the small number of cases. However, as the FIT has been the only screening strategy from the fifth round onwards, we will be able to evaluate FIT sensitivity and identify whether there are differences according to demographic, screening, and tumor characteristics.
As far as we know, only one study in Spain has validated the ability of a hospital administration data set to detect incident cases of CRC using a cancer registry as the gold standard and measured agreement between the MDS and registration of CRC [29]. The study population consisted of incident cases of CRC in 2000 obtained from the cancer registry and cases in the MDS of a public hospital for the same year. Around 2% of CRC identified through the cancer registry did not require hospitalization. The MDS detected 85% of the cases, and the main reason for missed cases was a difference in the incidence date (12 out of 13 cases that did not match were found in 2001). We reviewed CRC from 2000 to 2013, so this potential error should be minimized.
Another limitation is that we calculated the program sensitivity using the traditional method and without taking into account the CRC sojourn time. We could not calculate proportional incidence because we do not have a cancer registry that covers our screening area. However, the program sensitivity may be more easily interpretable than the proportional incidence and is not dependent on the underlying incidence of disease in the screened population.

Future Research.
Further research is needed to discriminate IC due to a false negative screening result from interval cancers de novo, with a rapid tumor growth. Recent biological data have begun to suggest that a proportion of IC exhibits altered biological features that may contribute to their rapid development relative to those that are detected on screening [30]. Thus, it remains necessary to further establish and expand our understanding of molecular pathways involved in IC to modify and/or develop appropriate screening programs and specific treatment options to combat this unique form of CRC [31,32].
Association of molecular features (i.e., BRAF and KRAS mutations, CpG island methylator phenotype, and sporadic microsatellite instability) with dietary and lifestyle factors also needs to be explored [33,34].

Methodological Issues for Benchmarking and Comparison of IC.
There is some variation regarding how IC are actually defined, identified, and reported in different programs. Differences in definition, quantification, or completeness of identification of IC between programs distort the underlying differences in IC frequency and obscure interpretation of this measure [35].
An operational definition and a quantification method for interval cancer are needed to eliminate or control some sources of artefactual variation across programs. The accuracy of identification of interval cancer may potentially be the largest source of error and discrepancy between programs [35].
There are a few expert working groups that are making efforts to that end (NHS Bowel Cancer Screening Program, World Endoscopy Organization, Spanish Network of Cancer Screening Programs) [36,37]. Their aim is to define IC to facilitate benchmarking and comparison of IC rates across programs and regions. Incomplete follow-up in the screened population and imprecise linkage between data sources are the main limiting factors in the identification of IC.

Conclusion
A large number of cancers are not detected through fecal occult blood test-based screening. Screening programs using gFOBT should take into account that initiatives aimed at decreasing inconclusive results could improve screening quality and program sensitivity. Nevertheless, many European countries that have introduced population-based screening programs based on gFOBT have already switched to immunoassay as their screening test of choice. As a consequence, there are fewer individuals using FIT with an incomplete screening episode and the overall sensitivity is higher compared with gFOBT screening programs.
On the other hand, high CRC rates among nonattenders in further screening rounds highlight the need to better inform our target population that successive screening is required to avert CRC or detect it early.
Variation in calculating the interval cancer rate makes it vital that a clearly defined protocol is established for the definition, identification, and reporting of interval cancers. International comparisons and benchmarking of IC could lead to a better understanding of the relationship between program performance and screening practices.