Nomogram for Predicting the Relationship between the Extent of Visceral Pleural Invasion and Survival in Non-Small-Cell Lung Cancer

Objective. Although visceral pleural invasion (VPI) has already been incorporated into the TNM staging system, few studies have evaluated the prognostic value of the extent of VPI for the survival of non-small-cell lung cancer (NSCLC) patients. We therefore used the Surveillance, Epidemiology, and End Results (SEER) database to assess the correlation between the extent of VPI and survival in NSCLC. Methods. We incorporated the extent of VPI into a prognostic nomogram. Patients in the SEER database diagnosed with NSCLC (n = 87,045) from 2010 to 2015 were analyzed and randomly assigned to either the training group (n = 60,933) or the validation group (n = 26,112). Clinical variables were evaluated by multivariate Cox regression and incorporated into the predictive model. The accuracy and discrimination of the nomogram were then assessed through the concordance index (C-index), calibration curves, and Kaplan–Meier curves. Results. Multivariate analysis demonstrated that the extent of visceral pleural invasion was an independent and unfavorable prognostic factor. The C-indexes of the training and validation groups were 0.772 (95% CI: 0.770–0.774) and 0.769 (95% CI: 0.765–0.773), respectively, indicating that the nomogram had sufficient credibility and stable predictive accuracy. The calibration curves showed consistency between the actual and predicted values in both the training and validation groups. Conclusion. The prognostic nomogram with the extent of VPI could offer an accurate risk evaluation for patients with NSCLC. Independent external validation of this research should be conducted in the future.


Introduction
Lung cancer is a group of clinically and histologically heterogeneous diseases and is the leading cause of cancer-related mortality worldwide [1]. Approximately 2.1 million people are diagnosed with lung cancer each year worldwide, and about 1.8 million die of the disease [2]. Approximately 85% of patients are classified as having non-small-cell lung cancer (NSCLC). The 5-year survival rate for NSCLC patients ranges from 4% to 17%, depending on stage and geographic region [3].
Visceral pleural invasion (VPI) has been identified as a poor prognostic factor in NSCLC and was first adopted as a T descriptor in the 5th edition of the TNM classification criteria in the 1970s [4]. According to the International Association for the Study of Lung Cancer (IASLC) standard classification, PL0 is defined as no evidence of visceral pleural invasion beyond the elastic layer; PL1 as invasion beyond the elastic layer; PL2 as invasion to the pleural surface; and PL3 as invasion of the parietal pleura, thoracic wall, or both [5,6]. In the recently updated 7th and 8th editions of the American Joint Committee on Cancer (AJCC) staging system, VPI is regarded as a non-size-based T2 factor, so that a tumor previously classified as T1 is now considered T2a when VPI is present [6,7]. Although the prognostic value of VPI has been widely recognized, some studies have argued that the extent of VPI has a limited impact on the survival of lung cancer patients [8,9]. It remains controversial whether the extent of VPI affects the survival of NSCLC patients.
Accurate prognosis and personalized treatment for NSCLC patients require assessment of clinicopathological prognostic factors, including histological classification, tumor differentiation, and tumor size. Determining the subdivision level of VPI has limitations in clinical application, because it is usually based on imaging features and clinical judgment rather than pathological diagnosis. VPI in NSCLC patients is usually determined by the tumor site, radiographic findings, and other clinical characteristics; therefore, patients have variable clinical outcomes. Moreover, although VPI has been incorporated into the existing AJCC TNM staging system, the relationship between the subdivision of VPI and prognosis in NSCLC patients remains ambiguous. For these reasons, we propose the use of additional prognostic models to evaluate the survival outcomes of these patients.
Nomograms, as prognostic tools, integrate predictive factors into a statistical model and have been extensively applied to predict the survival outcomes of various cancer patients [10,11]. So far, no nomogram involving the subdivisions of VPI has been established to predict prognosis in NSCLC patients. Therefore, we aimed to establish a nomogram that can visualize the survival prediction results of NSCLC patients with different extents of VPI by analyzing data from the Surveillance, Epidemiology, and End Results (SEER) database.

Data Abstraction.
The SEER database, maintained by the National Cancer Institute (NCI), is a primary source of population-based cancer statistics in the USA, capturing approximately 97% of cancer incidence and covering about 28% of the USA population in 17 SEER registries [12] (https://seer.cancer.gov/). We extracted the data of non-small-cell lung cancer patients from the SEER database using the SEER*Stat program (v 8.2.1). The classification of VPI was based on the proposal of the International Association for the Study of Lung Cancer (IASLC) [5]. According to the SEER data term CS Site-Specific Factor 2, the subdivision level of VPI confirmed by pathology was introduced in this database in 2010. Therefore, we retrieved only the data of lung cancer patients diagnosed by pathology from 2010 to 2015 and initially obtained a total of 443,960 patients. The inclusion criteria used to further screen the patients were as follows: (1) all included subjects were histopathologically confirmed as NSCLC; (2) histological subtypes, including squamous cell carcinoma, adenocarcinoma, large-cell carcinoma, and bronchioalveolar carcinoma, were assigned according to the World Health Organization (WHO) classification system; (3) the diagnosis of NSCLC was primary and unique; (4) the degree of visceral pleural invasion and survival status were definite; and (5) comprehensive clinical information, such as race, sex, age, marital status, T stage, N stage, M stage, tumor grade, survival months, vital status, and the extent of VPI, was recorded. The exclusion criteria were as follows: (1) incomplete clinical data involving survival time; T, N, and M stage; or the subdivision level of VPI and (2) histologically confirmed small-cell lung cancer. The eligible NSCLC patients were randomly divided into either the training group (n = 60,933) or the validation group (n = 26,112) at a ratio of 7:3.
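The random 7:3 division described above can be illustrated with a minimal sketch. The source does not report how the split was implemented, so the function name, the seed, and the rounding rule below are assumptions for illustration only. An exact 70% cut of 87,045 records differs slightly from the reported group sizes, so the authors' procedure presumably was not an exact-count split.

```python
import random


def split_train_validation(record_ids, train_fraction=0.7, seed=2010):
    """Randomly partition record ids into training and validation lists.

    The seed and the rounding rule are illustrative assumptions; the source
    states only that patients were divided at random at a ratio of 7:3.
    """
    shuffled = list(record_ids)
    # A dedicated Random instance keeps the split reproducible.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical integer ids standing in for the 87,045 eligible patients.
train_ids, validation_ids = split_train_validation(range(87045))
print(len(train_ids), len(validation_ids))
```

Fixing the seed makes the partition repeatable, which matters when the validation group is later used to check the model fitted on the training group.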

Variable Definition.
The study variables were recorded as follows: gender (female and male), race (black, white, and others), T stage (T0, T1, T2, T3, and T4), N stage (N0, N1, N2, and N3), M stage (M0 and M1), tumor grade (well-differentiated, Grade I; moderately differentiated, Grade II; poorly differentiated, Grade III; and undifferentiated, Grade IV), histologic classification (adenocarcinoma, squamous cell carcinoma, large-cell carcinoma, and bronchioalveolar carcinoma), marital status (married (including common law), divorced, single (never married), widowed, and unknown), and the extent of VPI (no evidence of visceral pleural invasion based on clinical and/or pathological judgment, PLx; tumor confined within the lung parenchyma or not fully invading beyond the elastic layer based on histopathological evidence, PL0; tumor penetrating beyond the elastic layer based on histopathological evidence, PL1; tumor penetrating to the surface of the visceral pleura based on histopathological evidence, PL2; and tumor extending to the parietal pleura based on histopathological evidence, PL3). The primary end point was overall survival (OS), which was calculated from the date of diagnosis to death from any cause.
We obtained permission to retrieve the SEER research data files with the reference number 16828-Nov2017. The research data used in this study contained no individually identifiable information about the subjects. Therefore, this study did not require informed consent or ethical approval.

Statistical Analysis.
We used R software version 3.5.1 (R Foundation for Statistical Computing, Vienna, Austria) to perform statistical analysis and generate graphics. The continuous variables were converted to categorical variables and compared using the Chi-squared test. The survival curves for OS were estimated using the Kaplan-Meier method, and differences were tested using a log-rank test stratified by the prognostic factors.
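The Kaplan-Meier method mentioned above can be sketched in a few lines. This is an illustrative Python implementation of the product-limit estimator, not the authors' R code; the function name and the toy follow-up data are hypothetical.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct death time.

    times:  follow-up in months for each patient
    events: 1 if the patient died at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # everyone leaving the risk set at time t
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical toy data: six patients, two of them censored.
toy_times = [5, 8, 8, 12, 20, 24]
toy_events = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(toy_times, toy_events)
for month, prob in curve:
    print(month, round(prob, 4))
```

Censored patients contribute to the risk set up to their last follow-up but never trigger a drop in the curve, which is why the estimator can use incomplete follow-up without bias under non-informative censoring.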
The Cox proportional hazards multivariate regression was used to further analyze the variables with P-values less than or equal to 0.05 in the univariate analysis. The graphical nomogram was obtained using the logistic regression model from the training group using the R package rms. The maximum points for each variable were set to 100. The discrimination and predictive powers of the nomogram model were evaluated using the concordance index (C-index) [13]. The discrimination ability of the prognostic model increases with the C-index, with a value of 0.5 representing random chance and 1.0 representing perfect discrimination. The calibration plots of the nomogram for 3- and 5-year OS, constructed with 200 bootstrap resamples, were used to evaluate the consistency between the predicted and actual survival. A two-tailed P-value of less than 0.05 was considered statistically significant.
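To make the C-index concrete, the following is a minimal sketch of Harrell's concordance index for right-censored data. It is an assumption that this variant matches what the R tooling computed; the function name and toy values are hypothetical, and the quadratic loop is meant for illustration rather than for a cohort of this size.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: fraction of usable pairs ranked correctly by risk.

    A pair (i, j) is usable when patient i died and did so before patient j's
    last follow-up. It is concordant when i also has the higher predicted
    risk; tied risk scores count as half.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot anchor a usable pair
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Hypothetical toy data: four patients, the third one censored.
c_index = concordance_index(
    times=[2, 4, 6, 8],
    events=[1, 1, 0, 1],
    risk_scores=[0.9, 0.7, 0.8, 0.1],
)
print(c_index)  # 4 of 5 usable pairs are concordant -> 0.8
```

A model that ranks every usable pair correctly scores 1.0, and one that assigns risk at random scores about 0.5, which is the interpretation given in the text.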

Characteristics of Patients.
This study initially included a total of 87,045 lung cancer patients from the SEER research database from 2010 to 2015, who were subsequently randomly divided into either the training group (n = 60,933) or the validation group (n = 26,112) at a ratio of 7:3. The selection process is presented in a detailed flow chart (Figure 1), and the demographic characteristics are listed in Table 1. Among these patients, 42,199 (48.5%) were female, and 44,846 (51.5%) were male. The majority of the initially included patients were married elderly patients (>60 years) and had adenocarcinoma and a tumor grade of III.

Prognostic Factors Associated with OS.
The variables of the training group, including gender, age, race, marital status, histology classification, tumor grade, TNM stage, and the subdivision of VPI, were entered into the univariate analysis. All variables with a P-value of less than 0.05 in the univariate analysis were regarded as risk factors and were further analyzed using Cox proportional hazards multivariate regression. Ultimately, the results showed that the variables retained after the univariate and multivariate analyses were independent prognostic factors (Table 2).

Nomogram Construction and Validation.
Based on these results, the independent prognostic factors were incorporated into the construction of the nomogram predicting 3- and 5-year overall survival (OS) in the training group (Figure 2). This nomogram predictive model revealed that M stage had the biggest impact on prognosis, followed by the extent of VPI, histology type, T stage, tumor grade, N stage, age, gender, pleural, and race. The score of each prognostic factor was read from the points axis at its intersection with a vertical line drawn from the patient's value on that variable's axis. Next, the 3- and 5-year survival probabilities were obtained from the sum of the scores of all prognostic factors. Higher total scores correlated with decreased survival. The C-index was 0.772 (95% confidence interval (CI): 0.770-0.774) in the training group and 0.769 (95% CI: 0.765-0.773) in the validation group.
These results displayed adequate discrimination ability in the prediction of NSCLC patients' absolute risk. The calibration curves, which were validated using the bootstrap resampling method, indicated that the survival predicted by the nomogram was in accordance with the actual observed values in both groups (Figure 3). Kaplan-Meier curves stratified by the subdivision levels of VPI (Figure 4) displayed the survival differences in both the training and the validation groups.
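The point-scoring procedure described in this section can be sketched as follows. Every number below is hypothetical: the coefficients are not the fitted values from Table 2, the baseline survival values are invented, and the sketch assumes a Cox-type proportional hazards form together with the common convention that the predictor with the widest coefficient range spans 0 to 100 points.

```python
import math

# Hypothetical log-hazard coefficients, NOT the fitted values from the study.
HYPOTHETICAL_COEFS = {
    "m_stage": {"M0": 0.0, "M1": 1.2},
    "vpi": {"PL0": 0.0, "PL1": 0.3, "PL2": 0.5, "PL3": 0.8},
    "n_stage": {"N0": 0.0, "N1": 0.4, "N2": 0.6, "N3": 0.9},
}
# Hypothetical baseline survival S0(t) at 36 and 60 months.
HYPOTHETICAL_BASELINE = {36: 0.80, 60: 0.70}


def points_per_unit(coefs, max_points=100.0):
    """Scale factor such that the most influential variable spans max_points."""
    widest = max(max(lv.values()) - min(lv.values()) for lv in coefs.values())
    return max_points / widest


def total_points(patient, coefs):
    """Sum the per-variable points, as read off the nomogram's points axis."""
    scale = points_per_unit(coefs)
    return sum(
        (coefs[var][level] - min(coefs[var].values())) * scale
        for var, level in patient.items()
    )


def survival_probability(patient, coefs, baseline, months):
    """Proportional hazards prediction: S(t) = S0(t) ** exp(linear predictor)."""
    linear_predictor = sum(coefs[var][level] for var, level in patient.items())
    return baseline[months] ** math.exp(linear_predictor)


example_patient = {"m_stage": "M0", "vpi": "PL2", "n_stage": "N1"}
print(round(total_points(example_patient, HYPOTHETICAL_COEFS), 1))
for months in (36, 60):
    p = survival_probability(
        example_patient, HYPOTHETICAL_COEFS, HYPOTHETICAL_BASELINE, months
    )
    print(months, round(p, 3))
```

The total-points axis of a nomogram is a graphical lookup for the same mapping: a higher point total corresponds to a larger linear predictor and therefore a lower predicted survival probability.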

Discussion
The incidence rate of VPI is approximately 11.5% among NSCLC patients and varies between histological types [14]. According to the 7th edition of the AJCC/UICC TNM staging system, stage T1a (≤2 cm) and stage T1b (>2 cm and ≤3 cm) are classified by tumor size, whereas VPI confirmed by pathology in NSCLC patients with stage T1 (≤3 cm) upgrades the tumor to stage T2a. Nonetheless, the detailed subdivision levels of VPI are not incorporated into the T stage in the TNM staging system (8th edition). Some [15–17] but not all [18,19] studies showed that NSCLC patients with VPI had a worse prognosis than those without VPI. Moreover, because visceral pleural invasion is an independent risk factor, the relationship between tumor staging and tumor size remains controversial. Pleural biopsy or medical thoracoscopy is an optional method to estimate the extent of VPI. The clinician often has to consider various limiting factors, such as the increasing risk of malignancy, expertise, and surgical complications. Thus, the extent of VPI can be predicted by imaging examinations, such as B-ultrasound, computed tomography, and other methods. However, reliance on a single prediction method and increased medical costs may limit diagnostic accuracy and wide clinical use. Combining imaging with other simple and effective methods therefore has meaningful clinical value.
Therefore, we performed this study to establish a simple predictive tool that can assist clinicians in preliminary screening.
Nomograms are prognostic tools that present complex statistical models as concise diagrams, which supply more exact and understandable prognostic predictions and are widely applied in clinical practice [20,21]. Predictive models based on large sample data from the SEER database might be less prone to the bias that is selected by the null hypothesis. Meanwhile, although population-based data lack detail, analyses of a nomogram model derived from the SEER database might be more accurate and more likely to overcome the bias arising from institutional practice [22]. Many clinical studies [23–26] have established nomogram prognostic models to support survival counseling and follow-up planning. These results demonstrated the efficiency of the nomogram as a predictive tool in clinical practice. Considering their clinician-friendly interface and accurate predictions, we constructed nomograms derived from the SEER database to predict the correlation between the subdivision level of VPI and the OS of NSCLC patients. We selected the independent prognostic factors using univariate and multivariate Cox proportional hazards regressions to construct the nomogram model. The OS, defined as the time from diagnosis to death from any cause, was assessed by the Kaplan-Meier method and tested by the log-rank test. Consistent with many previous studies [15,18,27], a negative correlation between the subdivision level of VPI and OS was observed. Many clinical studies have shown that NSCLC patients with VPI have poorer survival than those without. The poor prognosis may be related to more frequent involvement of mediastinal lymph nodes [28–30]. On the other hand, lung cancer cells under the pleura are more likely to invade the pleural layer rapidly through the flow of pleural effusions in the pleural cavity.
Once diaphragmatic lymph nodes have been involved, malignant cells will further be drained to mediastinal lymphatic vessels. In this way, spreading malignant cells can more easily invade the cervical venous circulation, which facilitates metastasis. Malignant tumor cells show increasingly aggressive and progressive biological characteristics as the level of VPI increases, contributing to adverse outcomes for these NSCLC patients. Some studies have revealed a link between marital status and survival in cancer patients, and several potential mechanisms may explain the correlation. First, patients who are married have less distress and depression than unmarried patients after a diagnosis of cancer, as a partner can share the emotional burden and provide appropriate social support. Chronic stress, loneliness, and depression can downregulate immune responses, stimulate tumor angiogenesis, and increase tumor burden and invasiveness [31–34]. Second, patients with emotional and financial support from their spouses or children had better compliance with their doctors' recommendations [35].
There are several limitations in this population-based study. All of the data from the SEER database are retrospective and contain bias that should be taken into consideration. First, different medical agencies and professionals, including pathologists and surgeons, influence the detection rate and the quality of pleural invasion assessment. Moreover, the SEER database does not provide other information regarding NSCLC patients with VPI, including the method of detection, complications, pulmonary function, and treatment method, which could generate underlying bias as well. Thus, it would be useful to improve the accuracy of predictive models by incorporating novel predictors and introducing competing risk models. Moreover, the inclusion of the degree of VPI of early-stage NSCLC patients might also be a good candidate to control the confounding factors. Second, improved accuracy of a predictive model is usually accompanied by a compromise between the increasing complexity of predictive factors and the decreasing understandability of the model during nomogram construction. Considering this, variables of clinical importance and high repeatable practicability would be preferred. Moreover, the nomogram itself needs to be confirmed using calibration plots and the C-index due to its uncertainty. Nonetheless, the nomogram is a powerful supplement for clinician judgment and clinical decision-making. Third, occult pleural metastases, which cannot be assessed by routine pathological examination, can only be detected during thoracotomy. Considering this study's retrospective nature, selection bias could not be avoided. In this study, whether patients diagnosed as PL0 actually had occult pleural metastases is still unclear. In addition, occult micrometastases cannot be excluded in patients who displayed no signs of pleural invasion based on clinical and/or radiographic judgment.
Risk factors of these patients were relatively higher than those diagnosed with VPI. Fourth, based on the SEER database, we randomly divided the data into the training and the validation groups at the ratio of 7:3. This method of nomogram construction and validation is common, whereas further external verification was not available. Hence, we will focus on the data from multiple medical centers in further research to perform the validation of the external cohort.

Conclusion
We developed and validated a population-based nomogram model to predict survival differences among the different subdivision levels of VPI in NSCLC patients.
This study provides a novel perspective that helps clinicians determine survival prognosis and establish personalized treatment strategies for NSCLC patients with varying degrees of VPI, which is an effective supplement to traditional TNM staging.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Ethical Approval
The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. The SEER database is the leading cancer statistics database in the United States, and it is globally open and shared. The authors obtained the data by applying to the authorities.

Conflicts of Interest
The authors declare that they have no conflicts of interest.