Identifying Clinicopathological Risk Factors of the Regional Lymph Node Metastasis in Patients with T1-2 Mucinous Breast Cancer: A Population-Based Study

Background Pure mucinous breast cancer (PMBC) has a better prognosis than other types of invasive breast cancer. However, regional lymph node metastasis (LNM) might reverse this outcome. We aimed to determine the independent predictive factors for regional LNM and to develop a nomogram model for clinical practice. Method Data of PMBC patients from the Surveillance, Epidemiology, and End Results (SEER) program between Jan 2010 and Dec 2015 were retrospectively reviewed. Univariate and multivariate logistic regression analyses were used to determine the risk factors for LNM in T1-2 PMBC. The nomogram was constructed and further evaluated in an internal validation cohort. Receiver operating characteristic (ROC) curves, decision curve analysis (DCA), and calibration curves were used to evaluate the accuracy of this model. Result Five variables, including age, race, tumor size, grade, and breast subtype, were identified as significantly associated with regional LNM in female patients with T1-2 PMBC. A nomogram was successfully established with a favorable concordance index (C-index) of 0.780, supported by an internal validation cohort with a C-index of 0.767. Conclusion A nomogram for predicting regional LNM in female patients with T1-2 PMBC was successfully established and validated in an internal cohort. This visualized model could help surgeons make appropriate clinical decisions in the management of primary PMBC, especially regarding whether axillary lymph node dissection (ALND) is warranted.


Introduction
Breast cancer incidence is rising rapidly, and it is now the cancer with the highest incidence rate among women worldwide, according to the latest report from the American Cancer Society [1]. Compared with invasive ductal carcinoma (IDC) not otherwise specified (NOS), pure mucinous breast cancer (PMBC) is a pathologically and genetically distinct mammary neoplasm that represents 1-4% of breast cancers and carries a relatively good prognosis [2-5]. However, axillary lymph node metastasis (ALNM), a pivotal prognostic determinant, has been shown to correlate negatively with the long-term survival of patients with PMBC [6,7]. Therefore, regional lymph node status is crucial for clinicians making treatment decisions for this special type of breast cancer.
Sentinel lymph node biopsy (SLNB) is regarded as an effective intraoperative strategy for predicting potential ALNM in patients with clinically node-negative (cN0) breast cancer. It not only guides surgeons in judging the necessity of axillary lymph node dissection (ALND) but also helps to reduce postoperative complications. Moreover, the feasibility and accuracy of SLNB after neoadjuvant chemotherapy (NAC) in patients with initial node involvement were established in a recent meta-analysis [8]. SLNB was even found to be sufficient and reliable for patients with initially biopsy-proven node-positive breast cancer that converted to node-negative after NAC [9]. However, false-negative (FN) SLNB results are inevitable, especially in the presence of a small primary tumor with a single nodal metastasis [10,11], which poses a potential problem for the completeness of surgical dissection. The FN rate, ranging from approximately 2% to 27% across studies [8,10], was significantly associated with the number of sentinel lymph nodes examined. Notably, Moo et al. reported that micrometastases or isolated tumor cells (ITCs) after NAC were not an indicator for omitting additional ALND, even when not detected on intraoperative SLNB [12]. Thus, given the risk of ALNM going undetected by intraoperative SLNB, more indicators are needed to comprehensively predict regional lymph node metastasis (LNM) in patients with breast cancer.
With the popularization of mammography and breast ultrasound, an increasing number of female patients with small primary breast cancers are being detected by screening and clinically diagnosed. Despite a relatively small primary tumor (T1-2), defined as a maximum diameter of 50 mm or less, a considerable number of patients still present with regional metastasis, bone metastasis, or even visceral metastases at initial diagnosis [13]. Moreover, only a few studies have investigated risk factors for predicting the prognosis of patients with PMBC [6,7,14].
In the present study, we aimed to explore the clinicopathological risk factors for regional LNM and to construct a new nomogram for predicting this event in female patients with T1-2 primary PMBC, which could help clinicians preoperatively identify high-risk patients and make better individualized surgical decisions.

Data Source.
The data we analyzed were extracted from the Surveillance, Epidemiology, and End Results (SEER) database, which is derived from 18 cancer registries across the United States of America (USA), covers approximately 28% of incident cases in the country (http://seer.cancer.gov), and includes various ethnic groups.
Patients who met the following criteria were included: (1) Female patients between the age of 20 and 84 years.
Patients with no regional nodes examined, with distant metastasis, or with one or more coexisting cancers during the study period were excluded.

Data Analysis.
After excluding ineligible patients, 3,111 patients with PMBC from the SEER program were enrolled in this study. Patients diagnosed between 2010 and 2013 were designated as the training group, and patients diagnosed between 2014 and 2015 were designated as the validation group. The following clinicopathological characteristics were collected and transformed into categorical variables: age, race, gender, laterality, grade, location, size, histological type, the number of regional nodes examined, and the number of positive nodes.
Univariate and multivariate logistic regression analyses, performed in IBM SPSS (version 25.0), were used to identify independent risk factors. A two-tailed p-value of <0.05 was defined as the criterion for variable deletion when performing backward stepwise selection. The nomogram was developed and validated from the results of the multivariate logistic regression analysis using the rms package in R (R Foundation, Vienna, Austria, version 3.5.2, http://www.r-project.org). Harrell's C-index, which is equivalent to the area under the ROC curve, was calculated to assess the discrimination performance of the nomogram.
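For a binary outcome such as LNM, the stated equivalence between the C-index and the area under the ROC curve can be illustrated with a short Python sketch. It assumes the usual pairwise definition (the fraction of positive/negative pairs in which the positive case receives the higher predicted risk, with ties counted as one half). The function name and the toy data are hypothetical and are not taken from the study, which used the rms package in R.

```python
from itertools import product


def c_index_binary(y_true, y_score):
    """Pairwise concordance for a binary outcome (equals the ROC AUC).

    y_true  : iterable of 0/1 labels (1 = node-positive in this illustration)
    y_score : iterable of predicted risks, same length as y_true
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    concordant = 0.0
    for p, n in product(pos, neg):
        if p > n:
            concordant += 1.0   # positive case ranked above negative case
        elif p == n:
            concordant += 0.5   # tied predictions count as half
    return concordant / (len(pos) * len(neg))


# Hypothetical toy data, for illustration only.
labels = [0, 0, 1, 0, 1, 1]
risks = [0.05, 0.10, 0.08, 0.30, 0.40, 0.60]
print(c_index_binary(labels, risks))
```

In this toy example, 7 of the 9 positive/negative pairs are ranked correctly, so the concordance is 7/9.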

Baseline Characteristics.
A total of 3,111 female patients with PMBC who met the inclusion criteria were enrolled in this study. All patients were histologically diagnosed with primary PMBC, and the maximum tumor diameter was less than or equal to 50 mm (T1-2). Overall, regional LNM was identified in 240 (7.71%) cases of the initial cohort, including 152 (8.06%) in the training group and 88 (7.18%) in the validation group. In both the training and validation cohorts, the majority of patients were white (76.09% and 73.31%, resp.). Moreover, luminal A (94.09%) was the predominant tumor subtype of PMBC in this study, compared with luminal B (4.73%), TNBC (0.55%), and HER2 enriched (0.64%). The demographic and clinical characteristics of the patients in the training and validation datasets are summarized in Table 1.

Univariate and Multivariate Analyses of the Risk of Regional LNM.
To determine the predictive factors of regional LNM in female patients with T1-2 PMBC, eight variables, including age, race, tumor size, location, differentiation grade, laterality, and tumor subtype, were initially analyzed in univariate analysis (Table 2). Five variables that differed significantly in univariate analysis (age, grade, size, race, and tumor subtype; all p < 0.05) were carried into multivariate logistic regression analysis. The risk factors significantly associated with regional LNM were as follows: age under 45 years (p = 0.001), tumor size (5 mm < largest diameter ≤ 10 mm, odds ratio (OR) = 1.04, 95% confidence interval (CI): 0.19-5.54; 10 mm < largest diameter ≤ 20 mm, OR = 6.12, 95% CI:  15; p = 0.001). In addition, TNBC patients had a higher risk of regional LNM than patients with the luminal A type (OR = 3.06, 95% CI: 0.73-12.86; p = 0.029). However, neither tumor location nor laterality was significantly associated with the risk of regional LNM (p = 0.74 and p = 0.80, resp.).
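The odds ratios and confidence intervals above derive from logistic regression coefficients. As a minimal sketch, assuming Wald-type intervals (the paper does not state the interval method), the OR is exp(β) and the 95% CI is exp(β ± 1.96 × SE). The Python helpers below are hypothetical names introduced for illustration. The example back-calculates the standard error implied by the reported interval 0.19-5.54 for the 5-10 mm size category. It is not a reanalysis of the SEER data.

```python
import math

# Standard normal quantile for a two-sided 95% interval.
Z_95 = 1.959963984540054


def odds_ratio_ci(beta, se, z=Z_95):
    """Return (OR, lower, upper) from a log-odds coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def se_from_ci(lower, upper, z=Z_95):
    """Back-calculate the standard error implied by a Wald-type CI on the OR scale."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


# Reported values for 5 mm < largest diameter <= 10 mm: OR = 1.04, 95% CI 0.19-5.54.
se = se_from_ci(0.19, 5.54)
or_, lo, hi = odds_ratio_ci(math.log(1.04), se)
print(f"implied SE = {se:.3f}; OR = {or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
```

A Wald interval is symmetric around β on the log scale, so the reported OR should sit close to the geometric mean of the interval limits. Small deviations are expected from rounding in the published values.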

Predictive Nomogram Construction and Validation.
Based on the multivariate logistic regression results, the independent variables (age, race, grade, tumor size, and tumor subtype) were selected to establish a visualized nomogram for predicting regional LNM in female patients with primary T1-2 PMBC (Figure 1). The concordance index (C-index), which is equivalent to the area under the ROC curve (AUC), was 0.783 (Figure 2(a)). Moreover, to validate the nomogram, an internal validation cohort of 1,225 cases diagnosed between 2014 and 2015 in the SEER program was used, yielding a similar C-index of 0.767 (Figure 2(b)), which indicated favorable performance of the nomogram in predicting regional lymph node involvement in female patients with primary T1-2 PMBC.
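A nomogram of this kind is a graphical rescaling of the logistic regression model. The Python sketch below illustrates the general idea under stated assumptions: the predictor with the widest coefficient range spans 0 to 100 points, other predictors are scaled proportionally, and the total points map back to a probability through the logistic function. All coefficients, the intercept, and the category labels are HYPOTHETICAL values chosen for illustration. They are not the fitted values of the study, and the code is a generic sketch rather than the implementation used by the rms package.

```python
import math

# HYPOTHETICAL log-odds coefficients (reference level = 0.0); not from the paper.
COEFS = {
    "age": {">=45": 0.0, "<45": 0.9},
    "size": {"<=10mm": 0.0, "10-20mm": 1.2, "20-50mm": 2.0},
}
INTERCEPT = -4.0  # hypothetical


def build_points(coefs):
    """Scale each level to points so the widest predictor spans 0-100."""
    max_range = max(max(lv.values()) - min(lv.values()) for lv in coefs.values())
    points = {
        name: {
            level: 100.0 * (beta - min(levels.values())) / max_range
            for level, beta in levels.items()
        }
        for name, levels in coefs.items()
    }
    return points, max_range


def predicted_probability(patient, coefs, intercept):
    """Return (total points, predicted probability) for one patient."""
    points, max_range = build_points(coefs)
    total = sum(points[name][patient[name]] for name in coefs)
    # Linear predictor at zero points, then add the contribution of the points.
    baseline = intercept + sum(min(lv.values()) for lv in coefs.values())
    linear_predictor = baseline + total * max_range / 100.0
    return total, 1.0 / (1.0 + math.exp(-linear_predictor))


total, prob = predicted_probability({"age": "<45", "size": "20-50mm"}, COEFS, INTERCEPT)
print(f"total points = {total:.0f}, predicted probability = {prob:.3f}")
```

Reading total points off the axis and converting them to a probability gives the same result as evaluating the logistic model directly, which is why the nomogram can be used at the bedside without software.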
To further evaluate the predictive ability of the nomogram, a decision curve analysis (DCA) was performed in both the training and internal validation datasets. The standardized net benefits of the models were comparable, and their curves overlapped substantially. Namely, the DCA showed that using the nomogram was more beneficial than a treat-none or treat-all strategy when the threshold probability ranged from 0.05 to 0.5 (Figure 3). Furthermore, a calibration curve of the regional LNM risk nomogram in female patients with PMBC was also plotted. The result suggested good agreement in the training dataset, with a mean absolute error = 0.006 (Figure 4).
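The net-benefit quantity underlying the DCA follows the formula given in the decision curve caption of this paper: net benefit = (true positives/N) − (false positives/N) × p_t/(1 − p_t), where p_t is the threshold probability. The Python sketch below is a minimal illustration. It assumes that a patient is classified as high risk when the predicted probability is at or above the threshold. The function names and toy data are hypothetical and do not come from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)  # weighting factor from the caption
    return tp / n - (fp / n) * weight


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the treat-all strategy (treat-none is always 0)."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Hypothetical toy cohort of 10 patients, for illustration only.
labels = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
risks = [0.60, 0.10, 0.20, 0.05, 0.30, 0.02, 0.40, 0.03, 0.01, 0.08]

for pt in (0.05, 0.25, 0.50):
    print(pt, round(net_benefit(labels, risks, pt), 4),
          round(net_benefit_treat_all(labels, pt), 4))
```

A model is considered clinically useful at a given threshold when its net benefit exceeds both the treat-all value and zero (treat-none), which is the comparison plotted in a decision curve.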

Discussion
Currently, breast cancer is the leading malignancy among women, with the highest incidence rate worldwide, especially in the United States (accounting for approximately 30% of new cases) [1]. Although great advances have been made in therapeutic modalities, including but not limited to surgical techniques, adjuvant chemotherapy, radiotherapy, and even immunotherapy, for delaying disease progression and improving long-term prognosis, early diagnosis and systemic preoperative evaluation remain crucially important steps for these patients.
In recent years, much research has addressed the risk factors for lymph node involvement and survival in patients with different subtypes of breast cancer or with distant metastasis at initial diagnosis [13,15-19]. As IDC and invasive lobular carcinoma (ILC) account for the vast majority of cases, the clinicopathological characteristics of these two types have been the main object of intensive research. For instance, Wang et al. established a nomogram for predicting the prognosis of female patients with breast cancer and bone metastasis at presentation [19], and Cui et al. [15] established a nomogram for predicting LNM in TNBC patients. By contrast, PMBC, a rare histologic type of mammary neoplasm accounting for 1-4% of all breast cancers, has rarely been investigated and has usually been classified in an "others" group in several studies [14,16,19,20]. Recently, increasing attention has been paid to treatment modalities, especially the necessity of adjuvant chemotherapy, radiotherapy, and anti-HER2 therapy for this kind of malignancy [21-25]. Notably, the role of chemotherapy in PMBC remains controversial. In two population-based studies from the SEER database and the Korean Breast Cancer Registry [21,25], patients with PMBC did not derive a long-term survival benefit from chemotherapy, and the authors concluded that these patients could be exempt from chemotherapy. However, in a more recent publication, Gao [26] reported that early-stage hormone receptor (HR)-positive PMBC patients could benefit from adjuvant chemotherapy, with better overall survival (OS) than patients who did not receive chemotherapy (p < 0.001).
Although several previous studies have determined that positive lymph node status is the most important factor worsening the prognosis [6,7], an effective predictive model is still lacking to evaluate the risk of regional LNM and to guide whether ALND is appropriate during surgical intervention in patients with PMBC.
Figure 3: Decision curve analysis for regional lymph node metastasis (LNM) in female patients with T1-2 pure mucinous breast cancer (EI-score) in the training cohort and internal cohort. The decision curve analysis graphically shows the clinical usefulness of the EI-score based on a continuum of potential thresholds for regional LNM and the net benefit of using the EI-score to stratify patients (y-axis). Net benefit = (true positives/N) − (false positives/N) × (weighting factor). Weighting factor = threshold probability/(1 − threshold probability).
Figure 4: Calibration curves of the nomogram of the training cohort for predicting regional lymph node metastasis (LNM) in female patients with T1-2 pure mucinous breast cancer (bootstrap 1000 repetitions). The x-axis represents the predicted regional LNM. The y-axis represents the actual LNM. The diagonal dotted line stands for a perfect prediction by an ideal model. The solid line represents the performance of the nomogram; the closer it fits the diagonal dotted line, the better the prediction of the nomogram.
In the present study, to the best of our knowledge, we developed the first validated nomogram for predicting regional LNM in female patients with T1-2 PMBC based on clinicopathological features. Regional LNM was identified in 7.71% of patients, which was lower than the rate in one study with a larger sample size [6]. In the multivariate logistic regression analyses, age, tumor size, race, differentiation grade, and tumor subtype were significantly associated with regional LNM. Specifically, patients with younger age (<45 years), larger tumor size (>10 mm), black race, poor differentiation, or the HER2 enriched or TNBC subtype had a higher risk of regional LNM. These results were partially consistent with a previous study evaluating LNM in patients with different types of breast cancer.
By contrast, tumor location and laterality were not predictive of regional LNM in PMBC. Interestingly, some studies covering large-scale populations reported that primary tumor location was strongly correlated with positive axillary lymph nodes, particularly for tumors located in the nipple, central breast, or axillary tail [20,27]. This discrepancy might be attributed to the smaller study population and lower regional LNM rate in patients with PMBC compared with IDC or other types of breast cancer [6].
To build a more convenient, visual predictive model for clinical practice from the variables identified above, we established a novel nomogram. The risk of positive lymph nodes predicted by our nomogram ranged from 0.1 to 0.6. Besides, the C-index, which corresponds to the AUC of the ROC curve, was well above 0.70, indicating that our nomogram has sufficient discriminative ability. Moreover, the DCA results show that the nomogram has good clinical utility. To further evaluate the nomogram, it was applied to an internal validation cohort of 1,225 female PMBC patients diagnosed between 2014 and 2015 in the SEER database. As expected, the predictive ability in the validation group was satisfactory, with a C-index of 0.767. Compared with similar work on predicting LNM in patients with different subtypes of breast cancer, our study goes a step further. For instance, although the study population of the nomogram constructed by Cui et al. [15] for predicting LNM in TNBC patients was larger than ours, the C-index of its training set was only 0.684. Likewise, for the nomogram predicting LNM in T1 breast cancer developed by Zhao et al. [20], the C-indexes of the training and validation groups were 0.733 and 0.741, respectively, which were still lower than ours. Thus, these results support the utility of our nomogram in predicting regional LNM in patients with T1-2 PMBC. Additionally, the patients were stratified into different risk subgroups according to the nomogram, and a higher prevalence of LNM was observed in the high-risk subgroups. At present, the comprehensive treatment modalities for PMBC remain controversial but are worth further exploration [21,23,26].
This nomogram, combined with other preoperative indicators [28], could not only help surgeons decide whether ALND is appropriate for patients with T1-2 PMBC but also offer an alternative means of risk stratification to assist in selecting patients for adjuvant therapy.
Nonetheless, our study has some limitations that need to be clarified and addressed in future research. First, this is a retrospective cohort study, which may inevitably introduce some selection bias. Second, although the sample size of female patients with PMBC in our study was considered adequate, it remains smaller than that of several studies assessing the risk factors for regional LNM or long-term survival in patients with breast cancer [13,14,20]. Third, the vast majority of the study population was white (74.99%). For this reason, whether this nomogram applies to patients of other races and regions needs further exploration and external validation. Further prospective randomized controlled studies are needed to obtain more detailed strategies for the treatment of PMBC.

Conclusion
In summary, five clinical risk factors (age, race, tumor size, grade, and breast subtype) were identified as significantly associated with regional LNM in female patients with T1-2 PMBC. A novel nomogram for predicting regional LNM in these patients was successfully established and supported by internal validation. Our model could not only provide a more accurate reference for surgeons to identify individuals at risk of regional LNM preoperatively but also help them make appropriate clinical decisions in the management of primary T1-2 PMBC.