Clinical Characteristics, Outcomes, and Risk Factors for Patients with Diffuse Large B-Cell Lymphoma and Development of Nomogram to Identify High-Risk Patients

Objectives To analyze the clinical features, outcomes, and risk factors of patients with diffuse large B-cell lymphoma (DLBCL) in China, with the aim of establishing a new prognostic model based on risk factors. Methods Clinical features and outcomes of 564 patients newly diagnosed with DLBCL from January 2009 to May 2017 were analyzed retrospectively. Variables were screened by LASSO regression, and a nomogram was constructed. Results The 5-year overall survival (OS) of the cohort was 75%. The 5-year OS of patients stratified by International Prognostic Index (IPI) score was 90% (score 0–2), 73% (score 3), and 51% (score 4–5). Age > 60, Eastern Cooperative Oncology Group (ECOG) performance status > 1, Ann Arbor stage III-IV, bone marrow involvement, low albumin (ALB), and low lymphocyte/monocyte ratio (LMR) were independent predictors of OS. The predictive model was developed from five factors: age, bone marrow involvement, LMR, ALB, and ECOG score. The predictive ability of the model (AUC, 0.77) was better than that of IPI (AUC, 0.74) and NCCN-IPI (AUC, 0.69). The 5-year OS of patients in the low-, intermediate-, and high-risk groups identified by the new predictive model was 89%, 70%, and 33%, respectively. Conclusions The new prediction model had better predictive performance and could better identify high-risk patients.


Introduction
Diffuse large B-cell lymphoma (DLBCL) is the most common histological subtype of non-Hodgkin lymphoma (NHL), accounting for approximately 25% of NHL cases [1]. The disease is aggressive and requires aggressive medical intervention after diagnosis. The International Prognostic Index (IPI) [2] has played an important role in determining the prognosis of patients with DLBCL over the past two decades. With the addition of rituximab to the CHOP or CHOP-like regimen, the prognosis of patients in each IPI risk group improved. New prognostic scoring systems, such as R-IPI [3] and NCCN-IPI [4], have emerged to better discriminate the survival of patients with DLBCL.
With the innovation of immunohistochemistry and molecular examination techniques and the optimization of treatment strategies, we need more accurate prognostic models to identify very high-risk DLBCL patients or biological heterogeneity to guide individualized treatment. The application of Lasso regression [5] has facilitated the selection of variables, and nomograms [6] have proved to be of better predictive value. In this study, we analyzed the clinical characteristics of patients with DLBCL, explored the factors influencing survival, screened variables through Lasso regression, and constructed a nomogram to stratify the prognosis of patients with DLBCL.

Patients.
From January 2009 to May 2017, a total of 564 patients newly diagnosed with DLBCL according to the WHO classification [7] by three specialized pathologists were included. Patients who did not have complete clinical and immunohistochemical data or who were diagnosed but not treated in our hospital were excluded (n = 65). Because primary central nervous system lymphomas are highly heterogeneous entities, we also excluded them from our study (n = 41). Baseline data collected included gender, age, B symptoms, performance status, number of extranodal sites involved, presence of bulky disease (≥7.5 cm), Ann Arbor stage, cell of origin, lactate dehydrogenase (LDH), albumin (Alb), white blood cell count (WBC), hemoglobin (HGB), platelet count (PLT), absolute lymphocyte count (ALC), absolute monocyte count (AMC), the lymphocyte-to-monocyte ratio (LMR), and D-dimer. All patients in the cohort were routinely evaluated by lumbar examination before treatment. Bone marrow examination was performed before treatment to determine whether there was bone marrow involvement, and efficacy was assessed by whole-body computed tomography (CT) or positron emission tomography/computed tomography (PET-CT).

Treatment, Follow-Up, and Outcome.
The first-line therapy for DLBCL patients was the R-CHOP or an R-CHOP-like regimen [8]. Intravenous methotrexate (1 g/m², four times) [9] was added as central nervous system (CNS) prophylaxis in patients at high risk of CNS involvement. Elderly patients >70 years of age were divided into three groups (fit, unfit, and frail) according to the comprehensive geriatric assessment (CGA) [10] and were treated with R-CHOP, R-mini-CHOP [11], and R2 (rituximab 375 mg/m² on day 1, lenalidomide 25 mg on days 1-14) regimens, respectively. Follow-up was conducted by phone calls, review of medical records, or the electronic follow-up system we run. By the final follow-up date of May 1, 2022, 61 (10.8%) patients had been lost to follow-up. Overall survival (OS) was calculated from the time of diagnosis to death from any cause or the last follow-up.
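The OS definition above can be illustrated with a minimal sketch. The function name `overall_survival`, its arguments, and the conversion of days to months with an average month length are assumptions made for illustration; only the final follow-up date (May 1, 2022) and the endpoint definition come from the text.

```python
from datetime import date

# Final follow-up date stated in the text.
FINAL_FOLLOW_UP = date(2022, 5, 1)

def overall_survival(diagnosis, death=None, last_contact=None, cutoff=FINAL_FOLLOW_UP):
    """Return (months, event); event=1 means death from any cause was observed.

    Hypothetical helper: patients without an observed death are censored at
    their last contact, or at the study cutoff if no earlier date is known.
    """
    if death is not None and death <= cutoff:
        end, event = death, 1
    else:
        end, event = min(last_contact or cutoff, cutoff), 0
    # 30.4375 days is an average month length (an assumption, not from the paper).
    months = (end - diagnosis).days / 30.4375
    return months, event

# A patient diagnosed in January 2015 who died one year later.
months, event = overall_survival(date(2015, 1, 1), death=date(2016, 1, 1))
```

A patient lost to follow-up would be passed with `last_contact` set and `death` left as `None`, which yields a censored observation (`event == 0`).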

Model Foundation and Validation.
Of the original dataset, 66.7% (n = 376) was randomly selected as a training cohort and the rest (n = 188) as a validation cohort. Univariate and multivariate Cox regression analyses were performed to screen potential variables associated with OS. The selected significant variables (P value <0.5) were then entered into the least absolute shrinkage and selection operator (LASSO) regression algorithm, and a predictive nomogram was constructed. The area under the receiver operating characteristic curve (AUC) was used to evaluate the performance of the model. Calibration curves were drawn to determine whether the predicted and actual survival probabilities were consistent. The total score for each patient was assessed using the nomogram in an externally validated cohort and used as an independent factor for Cox regression analysis.
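The cohort split and the AUC evaluation can be sketched as follows. This is an illustration only: the function names, the seed, and the use of a binary outcome at a fixed time point (ignoring censoring) are assumptions, and the analysis in the study was performed in R, not with this code.

```python
import random

def split_cohort(patient_ids, train_fraction=2 / 3, seed=0):
    """Randomly split patient identifiers into training and validation cohorts."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # fixed seed (assumed) for reproducibility
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

def auc(scores, events):
    """AUC as the probability that a patient with the event has a higher
    score than a patient without it; ties count as one half."""
    pos = [s for s, e in zip(scores, events) if e == 1]
    neg = [s for s, e in zip(scores, events) if e == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# 564 patients split 2:1 gives cohorts of 376 and 188.
training, validation = split_cohort(range(564))
```

The pairwise definition of AUC used here is equivalent to the area under the ROC curve for a binary outcome; a time-dependent AUC for censored survival data would need a dedicated method.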

Statistical Analysis.
The survival of patients was analyzed using Kaplan-Meier survival curves, and differences between groups were compared by the log-rank test. GraphPad Prism 9.0 and R statistical software 4.1.3 (https://www.r-project.org/) were used to perform the statistical analyses. A P value <0.05 was considered statistically significant.
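For readers unfamiliar with the Kaplan-Meier estimator, a minimal sketch is given below. It is not the code used in the study (which relied on GraphPad Prism and R); the function name and the toy data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            # Multiply by the conditional probability of surviving time t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed  # deaths and censored patients leave the risk set
    return curve

# Toy data: deaths at times 1, 3, and 4; one patient censored at time 2.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
# curve == [(1, 0.75), (3, 0.375), (4, 0.0)]
```

The censored patient at time 2 contributes to the risk set at time 1 but not at time 3, which is why the second step halves the survival estimate.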

Construction and Validation of the Predictive Nomogram.
The predictive model (Figure 3(c)) was constructed from 5 factors identified by Lasso regression. In this model, ECOG performance status ≥2 was assigned the highest score of 100 points, and age > 60 y, bone marrow involvement, low LMR, and low Alb were assigned 86, 67, 58, and 28 points, respectively. The AUC for the nomogram was 0.77 (95% CI: 0.70-0.82), and the calibration curves of the nomogram showed good consistency between the predicted OS rates and the observed outcomes (Figures 3(d) and 3(e)).
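As an illustration of how a total nomogram score is obtained, the sketch below sums the points reported above for each factor present. The dictionary keys and the function name are hypothetical, and each factor is passed as a yes/no flag because the thresholds for "low" Alb and the mapping from total score to survival probability are read from the nomogram itself and are not reproduced here.

```python
# Points per factor as reported in the text; key names are illustrative.
NOMOGRAM_POINTS = {
    "ecog_ge_2": 100,
    "age_gt_60": 86,
    "bone_marrow_involvement": 67,
    "low_lmr": 58,
    "low_alb": 28,
}

def nomogram_total(**factors):
    """Sum the points of every factor flagged True for one patient."""
    unknown = set(factors) - set(NOMOGRAM_POINTS)
    if unknown:
        raise KeyError(f"unknown factors: {sorted(unknown)}")
    return sum(
        points
        for name, points in NOMOGRAM_POINTS.items()
        if factors.get(name, False)
    )

# A patient older than 60 years with low Alb and no other risk factor.
total = nomogram_total(age_gt_60=True, low_alb=True)  # 86 + 28 = 114
```

A patient with all five factors would reach the maximum of 339 points.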
To clearly demonstrate the relationship between IPI scores and the new model's predictions of outcome in DLBCL patients, a Sankey diagram was constructed (Figure 6). Using the new model, we further divided the patients in the IPI-defined high-risk group (score 4-5) into two subgroups: 117 patients classified as non-high-risk and 44 classified as high-risk. The baseline clinical characteristics of the two subgroups are shown in Table 3.

Discussion
To our knowledge, our cohort had the best clinical outcome among the reported studies with the same sample size of patients in general hospitals in China.
The median age of patients with DLBCL in our study was 58 years, which was consistent with the data reported by other research centers in Asia [14, 15] but lower than that reported on other continents [16, 17]. Compared with other studies [13, 18–27], especially those from cancer hospitals in China [6, 12], our cohort had a higher proportion of patients with advanced stage and B symptoms, which indicated that patients in our center had a heavier burden of disease. Primary extranodal DLBCL can originate from almost any part of the body, and the most common site of involvement in our cohort was the gastrointestinal tract. In addition, involvement of the mammary gland, thyroid gland, and testis also accounted for a large proportion, which was consistent with previously reported data [28–30].
Multicenter data showed that the 5-year OS of DLBCL is about 64% in the rituximab era [30–32], while the survival of our cohort was better. This may be due to the availability of more new drugs, improvements in supportive care, and appropriate adjustments in treatment regimens. Previous prediction models [2, 4] had shown that age, stage, ECOG PS, bone marrow involvement, and number of extranodal sites are important prognostic factors, and our data were consistent with these findings. In addition, the non-GCB [33] pathological subtype was a predictor of poor prognosis (5-year OS 73%).
Albumin levels are commonly used in lymphoma studies. Decreased albumin levels indicate poor nutritional status or tumor-related consumption. However, studies have shown that low Alb may be driven more by proinflammatory status [34] or increased cytokine release [35] than by nutritional status. Biccler et al. [20] considered Alb a predictor of poor clinical outcomes for patients with DLBCL. Similarly, our data suggested that patients with low Alb have significantly worse survival than other patients. In addition, McMillan et al. [36] considered albumin levels a good predictor of disease progression. Patients with low albumin levels, especially older patients, were more prone to coinfection, which was also associated with a worse prognosis [37].
Figure 3 (panels c-e): (c) The nomogram for predicting the OS of patients with DLBCL at 1, 3, and 5 years. Each patient's score for each of the 5 factors (e.g., age >60 years, 86 points) can be read from the nomogram, and these scores are summed to give the individual's total score. The estimated probability corresponding to this total score is the overall survival of the patient. (d, e) Calibration curves for predicting OS at 5 years in the training (d) and the validation cohort (e).
Journal of Oncology
As an easily available biomarker, LMR has been increasingly emphasized for its role in predicting the survival of patients with DLBCL. The absolute monocyte count was positively associated with the number of tumor-associated macrophages, which in turn was associated with a worse prognosis of DLBCL [38]. A low absolute lymphocyte count suggested poor immune status and was associated with poor prognosis in patients with DLBCL. Therefore, a lower LMR predicted worse clinical outcomes [39]. However, there is no uniform standard for the optimal cutoff value of LMR, and a meta-analysis of patients with DLBCL showed that reported LMR cutoffs ranged from 1.6 to 4 [40]. We therefore determined the cutoff by ROC curve analysis; in our study it was 2.5.
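The calculation of LMR and the selection of a cutoff from a ROC curve can be sketched as follows. The text does not state which ROC criterion was used, so maximizing the Youden index (sensitivity + specificity - 1) is an assumption here, as are the function names and the synthetic data.

```python
def lmr(alc, amc):
    """Lymphocyte-to-monocyte ratio from absolute counts in the same units."""
    return alc / amc

def youden_cutoff(values, events):
    """Return (cutoff, J) maximizing the Youden index J.

    A patient is classified as high risk when value <= cutoff, since a
    low LMR is the adverse direction. Assumed criterion, not confirmed
    by the source.
    """
    pos = [v for v, e in zip(values, events) if e == 1]
    neg = [v for v, e in zip(values, events) if e == 0]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    best_cutoff, best_j = None, -1.0
    for c in sorted(set(values)):
        sensitivity = sum(v <= c for v in pos) / len(pos)
        specificity = sum(v > c for v in neg) / len(neg)
        j = sensitivity + specificity - 1
        if j > best_j:
            best_cutoff, best_j = c, j
    return best_cutoff, best_j

# Synthetic LMR values (not study data): events cluster at low LMR.
values = [1.0, 2.0, 2.5, 3.0, 4.0, 5.0]
events = [1, 1, 1, 0, 0, 0]
cutoff, j = youden_cutoff(values, events)  # cutoff == 2.5, j == 1.0
```

In real data the classes overlap, so the maximal Youden index is below 1 and the chosen cutoff depends on the cohort, which is consistent with the range of cutoffs reported in the literature.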
Survival of patients with DLBCL has greatly improved in the last 20 years, but survival in high-risk patients remains poor. We developed a new prediction model to better distinguish high-risk patients. Starting from the original IPI and NCCN-IPI, we removed some variables and added Alb and LMR. After verification with internal and external data, the predictive model we developed proved to have good predictive performance. This model also had better predictive power than IPI and NCCN-IPI. High-risk patients identified by our model had a worse prognosis.
The main limitation of our study was that it was a single-center retrospective study, and its results may not be fully applicable to all patients with DLBCL. In addition, selection bias was difficult to avoid. The model we developed also needs to be validated in larger samples and external study cohorts.
In summary, we analyzed the clinical features of patients with DLBCL in our center and observed better survival than previously reported. Then, by identifying prognostic risk factors, we constructed a new model with better predictive performance, which may help clinicians to better predict clinical outcomes for patients in the rituximab era.