Predictive Value of Interim PET/CT in DLBCL Treated with R-CHOP: Meta-Analysis

Objective. We conducted a meta-analysis to evaluate the predictive value of interim 18F-FDG PET/CT in patients with DLBCL treated with R-CHOP chemotherapy. Methods. We searched the PubMed, ScienceDirect, Wiley, Scopus, and Ovid databases for articles published from inception to March 2014. Articles related to interim PET/CT in patients with DLBCL treated with R-CHOP chemotherapy were selected. PFS with or without OS was chosen as the endpoint to evaluate the prognostic significance of interim PET/CT. Results. Six studies with a total of 605 cases were included. The sensitivity of interim PET/CT ranged from 21.2% to 89.7%, and the pooled sensitivity was 52.4%. The specificity of interim PET/CT ranged from 37.4% to 90.7%, and the pooled specificity was 67.8%. The pooled positive likelihood ratio and negative likelihood ratio were 1.780 and 0.706, respectively. The AUC was 0.6987 and Q* was 0.6519. Conclusions. The sensitivity and specificity of interim PET/CT in predicting the outcome of DLBCL patients treated with R-CHOP chemotherapy were not satisfactory (52.4% and 67.8%, respectively). To improve this, more work is needed to unify the response criteria, and more research is needed to assess the prognostic value of interim PET/CT with semiquantitative analysis.


Introduction
The use of positron emission tomography/computed tomography (PET/CT) with 18F-fluoro-2-deoxy-D-glucose (18F-FDG) for staging, treatment monitoring, and restaging in patients with lymphoma has expanded remarkably in recent years. Numerous studies have reported that patients with a negative scan had better progression-free survival (PFS) and overall survival (OS), and these results have helped clinicians make further treatment decisions. Diffuse large B cell lymphoma (DLBCL) is the major histologic subtype of aggressive non-Hodgkin lymphoma (NHL) and accounts for about 30% of cases [1]. Most patients with DLBCL can be cured by chemotherapy, with or without radiotherapy. The most widely adopted first-line therapy is R-CHOP (rituximab, cyclophosphamide, doxorubicin, vincristine, and prednisolone). However, about 20-40% of patients still cannot be cured with R-CHOP and may need salvage therapy, such as high-dose chemotherapy followed by autologous stem-cell transplant (ASCT) [2]. It is therefore important to identify poor responders to first-line treatment so that they can be switched to alternative treatments as early as possible. An 18F-FDG PET/CT scan is recommended at baseline and at the end of treatment for DLBCL patients [3]. Interim 18F-FDG PET has shown high predictive value in Hodgkin's lymphoma (HL). However, the role of interim PET/CT in patients with NHL, including DLBCL, is still unconfirmed [4,5]. Because of variability in patient populations, treatment regimens, and timing of interim PET, together with nonstandardized FDG-PET interpretation criteria, previous analyses stated that no reliable conclusions could be drawn about interim PET in DLBCL [6,7].
In this paper, we performed a meta-analysis focusing on interim 18F-FDG PET/CT in DLBCL patients treated with R-CHOP chemotherapy.
The search used the keywords (PET or positron emission tomography), (DLBCL or diffuse large B cell lymphoma), humans, and English. Full-text articles were reviewed when abstracts did not provide sufficient information for a decision. Furthermore, the reference lists of retrieved articles were examined for additional relevant studies. Exact search strategies can be found in Figure 1.

Study Selection.
Two investigators independently reviewed the abstracts and further examined the full-text articles to select studies that met the following inclusion criteria: (1) studies that evaluated the predictive value of PET/CT performed between the first and the fourth cycle of first-line chemotherapy (R-CHOP) for patients with DLBCL; (2) studies that evaluated at least 10 patients and included at least five patients who progressed during chemotherapy through clinical follow-up; (3) studies that used positive and negative results of 18F-FDG PET/CT as a predictive factor according to SUVmax value or visual analysis.
In addition, when a study included patients who were treated with R-CHOP and other chemotherapy, we included it only if subgroup data on R-CHOP were separately extractable. We excluded abstracts, editorials, comments, letters, review articles, and studies that enrolled patients with HIV-associated or posttransplant lymphoproliferative disorders.
Many studies did not meet all the inclusion criteria but did partially include a relevant patient population. For these studies, we contacted the authors for relevant individual patient or subgroup data. When there was no response after 4 weeks, another correspondence was sent. When there was no response after the third communication attempt, we considered the request rejected.

Data Extraction and Quality Assessment.
Two investigators independently reviewed the selected studies and retrieved data on author, publication year, patient characteristics, and study design. PFS with or without OS data was chosen as the endpoint to evaluate the prognostic significance of interim PET/CT. For each study, the numbers of true-positive (TP), false-positive (FP), false-negative (FN), and true-negative (TN) results were calculated. Study quality was assessed with the QUADAS (quality assessment of studies of diagnostic accuracy included in systematic reviews) checklist, which has a maximum score of 14 [13].

Data Analysis.
We constructed a 2 × 2 contingency table consisting of TP, FP, FN, and TN results. We calculated sensitivity and specificity for each study. The summary sensitivity, specificity, and positive and negative likelihood ratios (LRs) of the included studies were also calculated. We assessed study heterogeneity by plotting sensitivity and specificity in the receiver operating characteristic (ROC) space and drew summary ROC curves and confidence regions for summary sensitivity and specificity. As a global measure for the summary ROC curves, we estimated the Q* statistic, the point on the ROC curve where sensitivity and specificity are equal. Data analyses were conducted with Meta-DiSc 1.4 software. All tests were two-sided and statistical significance was defined as a P value < 0.05.
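The per-study and summary measures described above can be sketched in a few lines of Python. This is a minimal illustration, not the Meta-DiSc implementation: the 2 × 2 counts are hypothetical, the mapping of TP/FP/FN/TN to PET result and progression status is the usual convention and is assumed here, and pooling by summing counts is only one simple option (likelihood ratios are often pooled with a separate weighted model, so pooled LRs need not equal the values derived from pooled sensitivity and specificity).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    # Assumed convention for a prognostic test:
    tp: int  # interim PET positive, patient progressed
    fp: int  # interim PET positive, no progression
    fn: int  # interim PET negative, patient progressed
    tn: int  # interim PET negative, no progression


def sensitivity(t: TwoByTwo) -> float:
    return t.tp / (t.tp + t.fn)


def specificity(t: TwoByTwo) -> float:
    return t.tn / (t.tn + t.fp)


def positive_lr(t: TwoByTwo) -> float:
    # How much more likely a positive scan is in progressors than in non-progressors
    return sensitivity(t) / (1.0 - specificity(t))


def negative_lr(t: TwoByTwo) -> float:
    # How much more likely a negative scan is in progressors than in non-progressors
    return (1.0 - sensitivity(t)) / specificity(t)


def pool_by_summing(tables: list[TwoByTwo]) -> TwoByTwo:
    # Simple pooling: add the cell counts across studies
    return TwoByTwo(
        tp=sum(t.tp for t in tables),
        fp=sum(t.fp for t in tables),
        fn=sum(t.fn for t in tables),
        tn=sum(t.tn for t in tables),
    )


# Hypothetical counts for two studies (illustrative only, not from Table 1)
studies = [TwoByTwo(tp=10, fp=15, fn=10, tn=65), TwoByTwo(tp=12, fp=20, fn=8, tn=60)]
pooled = pool_by_summing(studies)

print(f"pooled sensitivity = {sensitivity(pooled):.3f}")
print(f"pooled specificity = {specificity(pooled):.3f}")
print(f"LR+ = {positive_lr(pooled):.3f}, LR- = {negative_lr(pooled):.3f}")
```

A positive LR close to 1 and a negative LR close to 1 both indicate that the scan result changes the odds of progression very little.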

Eligible Studies.
We identified 330 potentially relevant studies, of which 324 were excluded. The study selection process is detailed in Figure 1. A total of 306 studies were excluded because they were not related to interim PET/CT in DLBCL, and 16 studies were excluded because subgroup data on DLBCL treated with R-CHOP were not separately extractable. One study was excluded because fewer than five patients progressed during chemotherapy through clinical follow-up, and one study was excluded because it was a duplicate. Six studies were therefore included in the final analysis.
The baseline characteristics of the six included studies are shown in Table 1 [14,15]. The response criteria of the six included studies were as follows: three studies used the visual method, while the other three used both the visual and the semiquantitative method (ΔSUVmax). Among the three studies that used both methods, the subgroup data of two studies [2,8] were extracted separately as follows: (a) the subgroup data of the visual method; (b) the subgroup data of the semiquantitative method. Three of the six studies reached definite statistical significance, while the other three showed undetermined results.

Quality Assessment.
Two reviewers independently assessed the quality items and discrepancies were resolved by discussion. The global quality score ranged from 11 to 13 (Table 1).

Data Analysis.
Heterogeneity is a potential problem when interpreting the results of a meta-analysis. The threshold effect must be considered first in test accuracy studies; it arises when sensitivity and specificity differ because studies use different cut-offs or thresholds to define a positive or negative test result [16,17]. We used the Spearman correlation coefficient to analyze the threshold effect; its value was 0.970 (P < 0.001), which indicated heterogeneity from threshold effects. The possible sources of nonthreshold effect heterogeneity included study design, methodologic study quality, and diagnostic criteria for PET timings. We used the diagnostic odds ratio (DOR) to analyze heterogeneity from nonthreshold effects. The Cochran Q was 2.39 with P = 0.9348 (as shown in Figure 2), which indicated no heterogeneity from nonthreshold effects.
The AUC is used to summarize the overall diagnostic accuracy. As shown in Figure 4, for the six included studies the AUC was 0.6987 and Q*, the point of maximum joint sensitivity and specificity, was 0.6519.
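A threshold-effect check of this kind is commonly computed as the Spearman correlation between the logit of sensitivity and the logit of (1 − specificity) across studies; a strong positive correlation suggests that studies differ mainly in where they place the positivity cut-off. The sketch below assumes that formulation and uses hypothetical sensitivity/specificity pairs, not the data of the included studies.

```python
import math


def logit(p: float) -> float:
    return math.log(p / (1.0 - p))


def average_ranks(values: list[float]) -> list[float]:
    # Rank from 1..n, giving tied values the mean of their positions
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: list[float], y: list[float]) -> float:
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman(x: list[float], y: list[float]) -> float:
    # Spearman's rho is the Pearson correlation of the ranks
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical (sensitivity, specificity) pairs, one per study
pairs = [(0.25, 0.90), (0.40, 0.80), (0.55, 0.70), (0.70, 0.40), (0.85, 0.55)]
logit_tpr = [logit(sens) for sens, _ in pairs]
logit_fpr = [logit(1.0 - spec) for _, spec in pairs]

rho = spearman(logit_tpr, logit_fpr)
print(f"Spearman rho between logit(TPR) and logit(FPR) = {rho:.3f}")
```

In these made-up data, sensitivity rises as specificity falls in all but one study, which is the pattern a threshold effect produces.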
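The two summary statistics are linked when the summary ROC curve is symmetric, that is, when a single diagnostic odds ratio (DOR) applies along the whole curve. The source does not state which SROC model was fitted, so the symmetry is an assumption here. Under it, Q* = √DOR / (1 + √DOR) and AUC = DOR · (DOR − 1 − ln DOR) / (DOR − 1)², which gives a quick consistency check on the reported values.

```python
import math


def dor_from_q_star(q_star: float) -> float:
    # At Q*, sensitivity = specificity = q_star, so DOR = (q / (1 - q))^2
    odds = q_star / (1.0 - q_star)
    return odds ** 2


def q_star_from_dor(dor: float) -> float:
    root = math.sqrt(dor)
    return root / (1.0 + root)


def auc_from_dor(dor: float) -> float:
    # Area under TPR = DOR*x / (1 + (DOR - 1)*x) for x in [0, 1]
    if math.isclose(dor, 1.0):
        return 0.5  # uninformative test: the curve is the diagonal
    k = dor - 1.0
    return dor * (k - math.log(dor)) / (k * k)


# Reported Q* from the text; the symmetric-curve assumption is ours
implied_dor = dor_from_q_star(0.6519)
implied_auc = auc_from_dor(implied_dor)

print(f"implied DOR = {implied_dor:.3f}")
print(f"implied AUC = {implied_auc:.4f}")
```

With the reported Q* of 0.6519, the implied AUC is about 0.699, in line with the AUC of 0.6987 given in the text.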

Discussion
The prognostic value of interim PET/CT performed during first-line therapy of patients with DLBCL is still unclear. Previous studies showed poor reproducibility and inconsistent accuracy and sensitivity of interim PET/CT because of different treatment modalities and response criteria. In an attempt to standardize interim PET/CT reporting criteria, the "First International Workshop on Interim PET in Lymphoma," held in 2009, developed consensus response criteria for interim PET. The response criteria were mainly based on visual and semiquantitative analysis. The visual response criteria used the Deauville five-point scale (5-PS): 1, no uptake; 2, uptake ≤ mediastinum; 3, uptake > mediastinum but ≤ liver; 4, uptake moderately increased compared to the liver uptake at any site; and 5, markedly increased uptake compared to the liver at any site and/or new sites of disease. As shown in Table 1, 2 of the 6 included studies used the Deauville five-point scale (5-PS). For semiquantitative analysis, since the maximal standardized uptake value (SUVmax) is the most commonly used semiquantitative method of PET analysis in oncology, the decrease in SUVmax after a few cycles of chemotherapy relative to the baseline (pretreatment) SUV, expressed as a percentage (ΔSUVmax), can be useful in interim PET evaluation [18]. Spaepen et al. [19] reported the value of interim PET in predicting the outcome of DLBCL patients who had been treated with different chemotherapy regimens using ΔSUV-based criteria. Lin et al. [20] found a ΔSUVmax of 65.7% to be the best cut-off level for differentiating patients with good or bad prognosis, with a very high degree of interobserver reproducibility.
However, since the patients in the 6 included studies were enrolled between 2004 and 2010, not all of them were evaluated with the response criteria developed by the "First International Workshop on Interim PET in Lymphoma," which contributed to the heterogeneity from the threshold effect.
In this study, we selected newly diagnosed DLBCL patients treated with R-CHOP. Our analysis showed that, because of different response criteria, the studies had an obvious threshold effect. The SROC curve was used to summarize the overall test performance, and the AUC was calculated as the summary indicator. An AUC of 0.97 or above is considered to indicate excellent accuracy, an AUC of 0.93 to 0.96 very good accuracy, and an AUC of 0.75 to 0.92 good accuracy; an AUC of less than 0.75 should be interpreted cautiously, because the test may have obvious deficiencies in accuracy and approaches a random test [21]. By these criteria, interim PET/CT, with an AUC of 0.6987, had deficiencies in accuracy in predicting the outcome of DLBCL patients treated with R-CHOP.
There are several potential limitations to this meta-analysis of diagnostic tests. First, many studies did not meet all the inclusion criteria but did partially include a relevant patient population. For these studies, we contacted the authors for relevant individual patient or subgroup data. Unfortunately, we received no responses and therefore could not obtain enough evidence to confirm the prognostic role of interim PET/CT. Second, because the patients in the included studies were enrolled between 2004 and 2010, only 3 of the 6 included studies used semiquantitative analysis. Lin et al. [20] found that SUV-based assessment of therapeutic response during first-line chemotherapy improved the prognostic value of early 18F-FDG PET compared with visual analysis in DLBCL. Casasnovas et al. [22] showed that SUVmax reduction improved the early prognostic value of interim PET scans in DLBCL. More research is therefore needed to assess the prognostic value of interim PET/CT with semiquantitative analysis. Third, the included studies were retrospective, and we suggest that larger prospective, high-quality, multicenter studies be conducted for DLBCL.
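The AUC bands quoted from [21] can be written as a small lookup. The function name and labels below are illustrative, and treating each band as a lower bound (so that a value such as 0.965, which falls between the stated bands, is assigned to the lower band) is an assumption rather than part of the cited criteria.

```python
def interpret_auc(auc: float) -> str:
    """Map an AUC to the qualitative bands quoted in the text."""
    if not 0.0 <= auc <= 1.0:
        raise ValueError("AUC must lie between 0 and 1")
    if auc >= 0.97:
        return "excellent"
    if auc >= 0.93:
        return "very good"
    if auc >= 0.75:
        return "good"
    # Below 0.75 the test approaches a random test and needs cautious reading
    return "interpret with caution"


# AUC reported in this meta-analysis
reported_band = interpret_auc(0.6987)
print(reported_band)
```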

Conclusion
As our study shows, the pooled sensitivity and specificity of interim PET/CT in predicting the outcome of DLBCL patients treated with R-CHOP chemotherapy were not as satisfactory as expected. To improve this, more work is needed to unify the response criteria, and more research is needed to assess the prognostic value of interim PET/CT with semiquantitative analysis.