Predictors of Long-Time Survivors in Nonmetastatic Colorectal Signet Ring Cell Carcinoma: A Large Population-Based Study

Background Colorectal signet ring cell carcinoma (SRCC) is a rare and distinct subtype of colorectal cancer (CRC), with extremely poor prognosis and aggressive tumor biological behavior. In this study, we aimed to analyze the clinicopathological characteristics and to identify the independent predictors of long-time survivors (LTSs) of nonmetastatic colorectal SRCC. Methods Patients diagnosed with nonmetastatic colorectal SRCC were extracted from the Surveillance, Epidemiology, and End Results (SEER) database. We compared the clinicopathological characteristics of LTSs (patients who survived more than 5 years) and non-LTSs (patients who survived 5 years or less). Afterwards, multivariate logistic regression analysis was used to identify independent predictors of LTSs, which were further used to construct a nomogram model to predict the probability of being an LTS. Results We enrolled 2050 patients with nonmetastatic colorectal SRCC, consisting of 1441 non-LTSs and 609 LTSs. Multivariate logistic regression analysis revealed that race, marital status, tumor infiltration, lymph node involvement, and primary tumor treatment were independent predictors of LTSs. These five parameters were incorporated into a nomogram model to predict the probability of being an LTS. In terms of model performance, the calibration curve revealed good agreement between the observed and predicted probability of LTSs, and the receiver operating characteristic curve showed acceptable discriminative capacity in the training and validation cohorts. Conclusion Collectively, we analyzed and profiled the clinicopathological characteristics of LTSs in patients with nonmetastatic colorectal SRCC. Race, marital status, T stage, N stage, and primary tumor treatment were independent predictors of LTSs.


Introduction
Colorectal cancer (CRC) remains the second most common cause of cancer-associated mortality [1], posing a great challenge to human health. Most CRCs are differentiated adenocarcinomas, followed in order by mucinous adenocarcinomas, signet ring cell carcinomas (SRCC), and squamous cell carcinomas [2]. SRCC was first described in 1951 by Laufman and Saphir [3] and has since been characterized by predominant intracytoplasmic mucin production in tumor cells (>50% of the cell size), which gives the cells the unique appearance of a signet ring [4,5]. Primary colorectal SRCC is a rare and unique entity of CRC, accounting for approximately 1% of all CRC [6]. Despite the low incidence, increasing attention has been paid to colorectal SRCC due to its rarity, aggressiveness, and distinct biological properties.
According to relevant studies, colorectal SRCC mostly originates from undifferentiated stem cells of colorectal mucosa, which might be the intrinsic cause for the high proportion of poor differentiation/undifferentiation, rapid tumor growth, diffuse infiltration, massive lymphatic involvement, great risk of distant metastasis, and peritoneal metastasis [7,8]. In terms of demographic factors, SRCC is more commonly seen in young populations and female patients [9].
Several studies have investigated the prognostic factors for SRCC, showing that age, sex, tumor grade, tumor size, and primary tumor site are independently associated with patient survival [10]. The prognosis of patients with colorectal SRCC is dismal, with a 5-year overall survival (OS) rate of 25% [11], which is far lower than that of colorectal adenocarcinoma. Surgical resection remains the mainstay for resectable colorectal SRCC. Other therapeutic approaches, including chemotherapy, radiation, and targeted therapy, have also been widely applied, and advances in clinical management in recent years have improved patient prognosis. However, due to the rarity of colorectal SRCC, few studies have investigated the specific characteristics of patients with colorectal SRCC who survive for a long time, which hinders further improvement in survival.
To this end, we extracted eligible patients with colorectal SRCC from the Surveillance, Epidemiology, and End Results (SEER) database to retrospectively analyze the clinicopathological characteristics and predictors of long-time survivors (LTSs). For better clinical application, we further constructed an easy-to-use nomogram model to predict LTSs in nonmetastatic colorectal SRCC, followed by the assessment of model performance.

Materials and Methods
Data Source and Patient Selection.
The SEER database is an authoritative source of data on cancer incidence and patient survival, including population-based data from 18 registries and covering approximately 30% of the US population [12]. SEER*Stat software (version 8.3.6, released on August 8, 2019) was used to select qualified patients with nonmetastatic colorectal SRCC diagnosed from 2004 to 2015. Since data from the SEER database are publicly available and deidentified, no institutional review or informed consent from patients was required in this study.
Patients included in the present study had to meet the following criteria: (1) histologically diagnosed colorectal SRCC between 2004 and 2015 based on the International Classification of Diseases for Oncology, third edition (ICD-O-3 code 8490); (2) age of 18 years or more; (3) active follow-up with cancer-specific survival (CSS) of no less than 1 month; (4) colorectal SRCC as the only or first primary malignancy; (5) no distant metastasis; and (6) available TNM stage. Based on these inclusion and exclusion criteria, 2050 eligible patients were finally included in our study (Figure 1).
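The six criteria above can be expressed as a single eligibility check. The following Python sketch is illustrative only: the study used SEER*Stat for case selection, and the `Patient` record and all of its field names are hypothetical stand-ins, not SEER variable names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    # Hypothetical record layout; real SEER*Stat exports use different labels.
    histology_code: int            # ICD-O-3 histology code
    diagnosis_year: int
    age_years: int
    css_months: Optional[float]    # cancer-specific survival, None if unknown
    first_or_only_primary: bool
    m_stage: Optional[str]         # "M0", "M1", or None if unavailable
    t_stage: Optional[str]
    n_stage: Optional[str]


def is_eligible(p: Patient) -> bool:
    """Apply inclusion criteria (1) to (6) as listed in the text."""
    return (
        p.histology_code == 8490                          # (1) SRCC histology
        and 2004 <= p.diagnosis_year <= 2015              # (1) diagnosis period
        and p.age_years >= 18                             # (2) adult patients
        and p.css_months is not None and p.css_months >= 1  # (3) CSS >= 1 month
        and p.first_or_only_primary                       # (4) only or first primary
        and p.m_stage == "M0"                             # (5) no distant metastasis
        and p.t_stage is not None and p.n_stage is not None  # (6) TNM available
    )
```

Keeping every criterion in one conjunction makes it easy to audit the filter against the numbered list.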

Variables and Outcomes.
In this study, patients were divided into two groups according to their survival time. LTSs referred to patients whose CSS was longer than 5 years, while patients whose CSS was no more than 5 years were defined as non-long-time survivors (non-LTSs). CSS was defined as the duration from initial tumor diagnosis to death caused by colorectal SRCC.
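As a minimal sketch of this grouping rule, assuming CSS is recorded in months so that 5 years corresponds to 60 months (the function name is hypothetical):

```python
FIVE_YEARS_IN_MONTHS = 60  # assumption: survival time is recorded in months


def is_long_time_survivor(css_months: float) -> bool:
    """Return True for an LTS (CSS strictly longer than 5 years)."""
    return css_months > FIVE_YEARS_IN_MONTHS
```

Note that a patient with a CSS of exactly 60 months falls into the non-LTS group under this definition.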
The baseline characteristics of patients were extracted from the SEER database for analysis. Age at diagnosis was categorized into four groups, namely, ≤40, 41-55, 56-70, and >70 years. Race was recorded as black, white, and other (mainly including American Indian, Asian, and Pacific Islander). Sex was recorded as male and female. Marital status included married and unmarried, and the latter included single, divorced, separated, and widowed. Tumor grade was recorded as well-differentiated/moderately differentiated and poorly differentiated/undifferentiated. With respect to tumor size, colorectal SRCCs were classified into ≤4 cm, 4.1-6 cm, and >6 cm. The primary tumor location was divided into right colon, left colon, and rectum. Right colon consisted of appendix, cecum, ascending colon, hepatic flexure, and transverse colon; left colon consisted of splenic flexure, descending colon, and sigmoid colon; and rectum consisted of rectosigmoid junction and rectum [13]. According to clinical guidelines, a minimum of 12 lymph nodes should be examined for adequate staging and prognostic assessment in CRC [14]. Thus, the number of sampled lymph nodes was divided into <12 and ≥12. T stage and N stage were clearly classified based on the SEER registry. In terms of primary tumor treatment, we divided patients into local tumor excision, surgery, and no primary tumor treatment. For chemotherapy and radiation, patients were categorized into two groups, namely, yes and no/unknown.
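The categorization rules described above can be written down compactly. This is a sketch under stated assumptions: the function names, the category labels, and the lowercase site strings are hypothetical, and only the cut-offs and site groupings come from the text.

```python
def age_group(age_years: int) -> str:
    """Four age bands: <=40, 41-55, 56-70, >70 years."""
    if age_years <= 40:
        return "<=40"
    if age_years <= 55:
        return "41-55"
    if age_years <= 70:
        return "56-70"
    return ">70"


def tumor_size_group(size_cm: float) -> str:
    """Three size bands: <=4 cm, 4.1-6 cm, >6 cm."""
    if size_cm <= 4:
        return "<=4 cm"
    if size_cm <= 6:
        return "4.1-6 cm"
    return ">6 cm"


def lymph_node_group(nodes_examined: int) -> str:
    """Dichotomize sampled lymph nodes at the 12-node threshold."""
    return ">=12" if nodes_examined >= 12 else "<12"


# Primary site to location group, following the grouping given in the text.
SITE_GROUP = {
    "appendix": "right colon",
    "cecum": "right colon",
    "ascending colon": "right colon",
    "hepatic flexure": "right colon",
    "transverse colon": "right colon",
    "splenic flexure": "left colon",
    "descending colon": "left colon",
    "sigmoid colon": "left colon",
    "rectosigmoid junction": "rectum",
    "rectum": "rectum",
}
```

For example, a 56-year-old patient with a 5 cm cecal tumor and 14 examined nodes would be coded as "56-70", "4.1-6 cm", "right colon", and ">=12".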

Construction and Assessment of Nomogram.
Patients were randomly assigned to the training cohort (N = 1466) and the validation cohort (N = 584) using a fixed random seed in R (training cohort : validation cohort = 7 : 3). Both univariate and multivariate logistic regression analyses were performed to identify independent predictors of LTSs. Afterwards, the five independent predictors (race, marital status, T stage, N stage, and primary tumor treatment) were used to construct a nomogram model to predict LTS.
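A seeded random split of this kind can be sketched as follows. The study performed the split in R; this Python version, the function name, and the seed value are assumptions for illustration, and it will not reproduce the reported cohort sizes, since the exact assignment depends on the authors' seed and sampling routine.

```python
import random


def split_cohort(patient_ids, train_fraction=0.7, seed=2020):
    """Shuffle IDs with a fixed seed and cut them into training/validation lists.

    The seed value is arbitrary; it only makes the split reproducible.
    """
    ids = list(patient_ids)
    rng = random.Random(seed)   # local generator, does not touch global state
    rng.shuffle(ids)
    cut = round(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]
```

Calling `split_cohort(range(2050))` twice returns the same two disjoint lists, which is the property that fixing the seed is meant to guarantee.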
The calibration of the nomogram-based LTS prediction was assessed with a calibration plot in both the training and validation cohorts, and its discrimination was quantified by the C-index. The receiver operating characteristic (ROC) curve and the area under the curve (AUC) were further used to assess the predictive accuracy of the nomogram model.
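For a binary outcome such as LTS versus non-LTS, the C-index is the probability that a randomly chosen LTS receives a higher predicted probability than a randomly chosen non-LTS, and it is numerically equal to the AUC of the ROC curve. The sketch below computes it by direct pairwise comparison; the function name is hypothetical, and this is not the routine the authors used.

```python
def concordance_index(predicted, observed):
    """C-index (equal to ROC AUC) for a binary outcome.

    predicted: predicted probabilities of being an LTS
    observed:  1 for LTS, 0 for non-LTS
    Ties in predicted probability count as half-concordant.
    """
    pos = [p for p, y in zip(predicted, observed) if y == 1]
    neg = [p for p, y in zip(predicted, observed) if y == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    concordant = 0.0
    for p_pos in pos:
        for p_neg in neg:
            if p_pos > p_neg:
                concordant += 1.0
            elif p_pos == p_neg:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.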

Statistical Analysis.
The chi-square test was used for comparisons between LTSs and non-LTSs. Univariate logistic regression analysis was conducted to identify possible predictors of LTSs in nonmetastatic colorectal SRCC, and variables with a P value < 0.05 in the univariate analysis were further analyzed in the multivariate model. Results were displayed as odds ratios (ORs) with 95% confidence intervals (CIs). The Kaplan-Meier method was used to plot survival curves, and the log-rank test was employed to determine the statistical significance of differences between groups. SPSS Statistics version 26.0 (SPSS Inc., Chicago, United States) and R version 3.6.1 (R Foundation for Statistical Computing, Vienna, Austria) were used for statistical analysis. A two-sided P value < 0.05 was considered statistically significant.
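In logistic regression, the OR for a predictor is the exponential of its coefficient, and the usual Wald 95% CI is obtained by exponentiating the coefficient plus or minus 1.96 standard errors. The sketch below illustrates this conversion; the function name is hypothetical, and the Wald interval is an assumption, since the text does not state which interval SPSS or R reported.

```python
import math


def odds_ratio_with_ci(beta: float, se: float, z: float = 1.96):
    """Convert a logistic coefficient and its standard error to (OR, lower, upper)."""
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    return odds_ratio, lower, upper
```

A predictor is significant at the two-sided 0.05 level under this approach exactly when the interval excludes 1.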

Baseline Characteristics of Patients.
According to the inclusion and exclusion criteria (Figure 1), 2050 eligible patients were enrolled in our study, consisting of 1441 non-LTSs and 609 LTSs. When comparing the clinicopathological characteristics between non-LTSs and LTSs, we found that although the age distribution differed significantly between the two groups, there was no clear trend indicating an association between age and longer survival. Regarding race, a higher proportion of white patients was observed in the LTS group (86.37%) than in the non-LTS group (80.01%). The sex distribution was not significantly different between the LTS and non-LTS groups (P = 0.148). With respect to marital status, there were significantly more married patients in the LTS group (60.92%) than in the non-LTS group (51.77%). For tumor grade, not surprisingly, well-differentiated and moderately differentiated tumors accounted for a higher proportion in the LTS group than in the non-LTS group. Tumor size ≤4 cm was more common in the LTS group than in the non-LTS group. For primary tumor location, the LTS group had a higher proportion of right-sided colon cancer. There was no statistically significant difference in the number of sampled lymph nodes between the two groups. Regarding TNM stage, advanced T stage and N stage were more common in the non-LTS group. In terms of tumor treatment, more patients in the LTS group underwent surgical resection or local tumor excision. However, relatively fewer patients in the LTS group received chemotherapy or radiation (Table 1).
Predictors of LTSs identified by logistic regression analysis are presented in Table 2. Interestingly, we found that marital status was an independent predictor. Thus, patients were divided into married and unmarried groups based on their marital status, and we performed a stratified analysis to investigate the association between marital status and LTSs. As shown in Table 3, marital status was significantly associated with LTSs in the majority of subgroups.
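The unadjusted association between marital status and LTS can be checked from the reported proportions. In the sketch below, the cell counts are reconstructed by rounding the reported percentages (60.92% of 609 LTSs and 51.77% of 1441 non-LTSs), so they are approximations rather than values taken from Table 1; the function names are hypothetical, and the Pearson chi-square is computed without continuity correction.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic (1 degree of freedom) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def crude_odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Unadjusted odds ratio for the same 2x2 table."""
    return (a * d) / (b * c)


# Counts reconstructed from the reported percentages (approximate).
lts_married = round(0.6092 * 609)
lts_unmarried = 609 - lts_married
non_lts_married = round(0.5177 * 1441)
non_lts_unmarried = 1441 - non_lts_married

chi2 = chi_square_2x2(lts_married, lts_unmarried, non_lts_married, non_lts_unmarried)
crude_or = crude_odds_ratio(lts_married, lts_unmarried, non_lts_married, non_lts_unmarried)

# 3.841 is the chi-square critical value for 1 degree of freedom at alpha = 0.05.
is_significant = chi2 > 3.841
```

The resulting statistic exceeds the critical value, in line with the significant difference described in the text. The crude OR is unadjusted and therefore need not match the multivariate estimate.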

Construction and Validation of Nomogram Model.
According to the multivariate logistic regression analysis, race, marital status, T stage, N stage, and primary tumor treatment were incorporated into a nomogram model to estimate the probability of LTS for an individual patient. As shown in Figure 2, primary tumor treatment had the largest effect on the probability of LTS, with a maximal score of 100. The other variables had varying effects on the probability of LTS.
The nomogram showed good accuracy in predicting LTS in the training cohort, with a C-index of 0.715 (Figure 3(a)). The calibration plot showed good agreement between the model predictions and actual observations for LTS (Figure 3(a)). Similarly, the C-index was 0.704 for the nomogram-based LTS prediction in the validation cohort (Figure 3(b)). As expected, the calibration curve showed good consistency between the observed and predicted LTS probability. Finally, the ROC curve was used to assess the predictive power of the nomogram-based prediction model for LTS probability. The AUC was 0.715 and 0.704 in the training cohort and validation cohort, respectively (Figure 4).

Discussion
SRCC, a distinct histological type of malignant tumor, is most often found in the stomach and is less common in other organs. Colorectal SRCC is a rare subtype of CRC, accounting for 0.1% to 2.6% of all CRC cases [6,15]. Previous studies have revealed a female predominance in colorectal SRCC [16,17], which is similar to that of gastric SRCC. Moreover, a younger age of onset has been reported in colorectal SRCC than in differentiated colorectal adenocarcinoma [18,19]. In terms of tumor location, several studies have reported that the right colon is most commonly affected by colorectal SRCC [20,21], because right-sided colon cancer has a higher incidence of microsatellite instability-high (MSI-high) status, BRAF mutation, and CpG island methylator phenotype-high (CIMP-high) status than left-sided colon cancer [22]. Thus, colorectal SRCC is a distinct entity compared to common colorectal adenocarcinoma. In consideration of the poor prognosis and aggressive tumor biology of colorectal SRCC, it is critical and intriguing to investigate the characteristics of patients who survive for a long time.
To the best of our knowledge, the present study is the first to analyze the clinicopathological characteristics and to identify the independent predictors of LTSs in nonmetastatic colorectal SRCC. According to the multivariate logistic regression analysis, white race, married status, less advanced T stage, absence of lymph node metastasis, and primary tumor treatment (including radical surgery and local tumor excision) were significant independent predictors of LTSs. Based on these results, we constructed a nomogram to predict LTSs, which is an easy-to-use visual tool for clinical use. As shown in Figure 2, primary tumor treatment exerted the largest impact on the probability of being an LTS, indicating the significant role of surgery in localized or locally advanced colorectal SRCC [23], especially in the era of multidisciplinary treatment of colorectal SRCC [24]. Other parameters (including race, marital status, T stage, and N stage) had relatively smaller effects. As a user-friendly statistical tool, a nomogram provides the probability of being an LTS by formula calculation [25]. This nomogram-based model could help clinicians distinguish between patients with a high and a low probability of being LTSs in nonmetastatic SRCC. For instance, for a black (0 points), married (20 points) patient with T1N1M0 (40 points for T1 and 22 points for N1) colorectal SRCC who received radical surgery (98 points), the total is 180 points and the probability of surviving over 5 years is approximately 0.4. Intriguingly, we found that married patients with nonmetastatic colorectal SRCC were more likely to survive for a long time, indicating that marital status is a significant prognostic factor. It is reasonable that spouse and family support plays a positive role in antitumor treatment and tumor surveillance [26]. Feng et al. have also reported similar findings [27] and suggest that the distress and psychological burden following tumor diagnosis could be shared and relieved by spouse support [28,29]. Further stratified analyses of marital status and LTSs suggest that marital status is significantly associated with LTSs in the majority of subgroups (Table 3). Consistent with most studies [18,30], we observed a proximal colon predominance for colorectal SRCC in our study (N = 1282, 62.5%).
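The worked nomogram example in this section (a black, married patient with T1N1M0 disease who received radical surgery) amounts to summing the points assigned to each predictor. The sketch below reproduces only that arithmetic, using the point values quoted in the text; the dictionary keys and function name are hypothetical, and the mapping from total points to a probability is read from the nomogram axis in Figure 2, not computed here.

```python
# Point values quoted in the worked example; other categories are not listed in the text.
EXAMPLE_POINTS = {
    "race: black": 0,
    "marital status: married": 20,
    "T stage: T1": 40,
    "N stage: N1": 22,
    "primary tumor treatment: radical surgery": 98,
}


def total_points(points: dict) -> int:
    """Sum the nomogram points across all predictors for one patient."""
    return sum(points.values())


example_total = total_points(EXAMPLE_POINTS)  # 0 + 20 + 40 + 22 + 98
```

The total of 180 points matches the value given in the text, which the nomogram maps to a probability of approximately 0.4.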

Gastroenterology Research and Practice
Apart from the common and readily available clinicopathological factors analyzed above, recent studies have also revealed molecular factors that are associated with patient prognosis in nonmetastatic colorectal SRCC. Some authors suggest that colorectal SRCC may arise from a separate genetic pathway compared to common adenocarcinoma [31]. RAS/RAF/MAPK signaling is an important pathway in colorectal carcinogenesis. BRAF mutation is associated with poor prognosis and resistance to anti-EGFR treatment in CRC [32]. BRAF mutations have been reported to be common in SRCC, with a frequency as high as 20% [33]. In addition, BRAF mutation is significantly associated with CIMP-positive status, with a relatively high incidence of the MSI-H phenotype (24-48%) in colorectal SRCC [34,35].
Colorectal SRCC is a distinct subtype of colorectal cancer. According to previous reports, colorectal SRCC presents as high-grade carcinoma and is more commonly associated with lymphatic invasion, vascular invasion, perineural invasion, and more advanced tumor stage [36,37]. Besides, SRCC is a significant prognostic factor for CRC [37]. In a recent nomogram predicting the overall survival of nonmetastatic colon cancer, SRCC was a significant predictor of poor prognosis [38]. Therefore, it would also be of interest to investigate how the predictors of LTSs differ between common colorectal adenocarcinoma and colorectal SRCC.
Our study has several limitations. First, intrinsic selection bias is unavoidable in this retrospective study. Second, chemotherapy and radiation were each recorded in only two categories, namely, "yes" and "no/unknown." Therefore, we are unsure about the effects of radiochemotherapy on the long-time survival of patients, although our present results indicate negative impacts. Third, the model performance was overall acceptable in both the training cohort and the validation cohort. However, external validation is still required to confirm the clinical applicability of our nomogram model.

Conclusion
To sum up, in this population-based study, we analyzed the clinicopathological characteristics of LTSs with nonmetastatic colorectal SRCC. We also identified several independent predictors of LTSs (race, marital status, T stage, N stage, and primary tumor treatment) and constructed a nomogram-based model for predicting the probability of LTSs, which showed acceptable performance in the training and validation cohorts.

Data Availability
Data are available from the corresponding author upon reasonable request.

Ethical Approval
Since the SEER database is publicly available and deidentified, approval was waived by the local ethics committee (the Affiliated People's Hospital of Ningbo University) for this retrospective study.

Consent
Written informed consent is not required in this retrospective analysis.

Conflicts of Interest
All authors declared no conflict of interest.