High SUVs Have More Robust Repeatability in Patients with Metastatic Prostate Cancer: Results from a Prospective Test-Retest Cohort Imaged with 18F-DCFPyL

Objectives. In patients with prostate cancer (PC) receiving prostate-specific membrane antigen- (PSMA-) targeted radioligand therapy (RLT), higher baseline standardized uptake values (SUVs) are linked to improved outcome. Thus, readers deciding on RLT need certainty about the repeatability of PSMA uptake metrics. As such, we aimed to evaluate the test-retest repeatability of lesion uptake in a large cohort of patients imaged with 18F-DCFPyL.
Methods. In this prospective, IRB-approved trial (NCT03793543), 21 patients with a history of histologically proven PC underwent two 18F-DCFPyL PET/CTs within 7 days (mean 3.7, range 1 to 7 days). Lesions in the bone, lymph nodes (LN), and other organs were manually segmented on both scans, and the following uptake parameters were assessed: maximum (SUVmax) and mean (SUVmean) SUVs, PSMA-tumor volume (PSMA-TV), and total lesion PSMA (TL-PSMA, defined as PSMA-TV × SUVmean). Repeatability was determined using Pearson's correlations, within-subject coefficient of variation (wCOV), and Bland-Altman analysis.
Results. In total, 230 pairs of lesions (177 bone, 38 LN, and 15 other) were delineated, demonstrating a wide range of SUVmax (1.5–80.5) and SUVmean (1.4–24.8). Including all sites of suspected disease, SUVs had a strong interscan correlation (R2 ≥ 0.99), with high repeatability for SUVmean and SUVmax (wCOV, 7.3% and 12.1%, respectively). High SUVs showed significantly improved wCOV relative to lower SUVs (P < 0.0001), indicating that high SUVs are more repeatable relative to the magnitude of the underlying SUV. Repeatability for PSMA-TV and TL-PSMA, however, was low (wCOV ≥ 23.5%). Across all metrics for LN and bone lesions, interscan correlation was again strong (R2 ≥ 0.98). Moreover, LN-based SUVmean achieved the best wCOV (3.8%), which was significantly lower than that of osseous lesions (7.8%, P < 0.0001). This was also noted for SUVmax (wCOV, LN 8.8% vs. bone 12.0%, P < 0.03). On a compartment-based level, wCOVs for volumetric features were ≥22.8%, demonstrating no significant differences between LN and bone lesions (PSMA-TV, P = 0.63; TL-PSMA, P = 0.9). Findings on an entire tumor burden level were also corroborated in a hottest lesion analysis investigating the SUVmax of the most intense lesion per patient (R2, 0.99; wCOV, 11.2%).
Conclusion. In this prospective test-retest setting, SUV parameters demonstrated high repeatability, in particular in LNs, while volumetric parameters demonstrated low repeatability. Further, the large number of lesions and wide distribution of SUVs included in this analysis allowed for the demonstration of a dependence of repeatability on SUV, with higher SUVs having more robust repeatability.


Introduction
Positron emission tomography (PET) with ligands targeting the prostate-specific membrane antigen (PSMA) is being increasingly utilized, with applications including treatment planning in patients with metastatic prostate cancer (PC) [1,2]. The accessibility of the PSMA active site to high-affinity ligands, combined with rapid internalization, allows for accurate, noninvasive high-contrast imaging [3]. Given their facile synthesis without need for a cyclotron, 68Ga-labeled radiotracers have been, to date, widely used. However, recent years have also witnessed an increased use of 18F-labeled radiotracers, initially with 18F-DCFBC [4] and other first-generation compounds, and later with more widely available radiotracers such as 18F-PSMA-1007, 18F-rhPSMA-7 [5,6], and 18F-DCFPyL (piflufolastat F18, PYLARIFY®) [7]. The latter agent has been extensively investigated in major clinical trials [8,9], including the multicenter phase 3 CONDOR and the phase 2/3 OSPREY trials [10,11], demonstrating positive predictive values of 78-91% for detecting PC in both pelvic lymph nodes (LN) and distant metastases. Based on these encouraging results, 18F-DCFPyL recently received approval from the U.S. Food and Drug Administration (FDA) [12]. Because it is a nationwide, commercially available, 18F-labeled PSMA PET agent [12], one may anticipate an increased use of this radiotracer both in clinical routine and in trials.
The repeatability of uptake features is an important property of 18F-DCFPyL that must be understood for response assessment, e.g., in a theranostic setting or in men starting abiraterone or enzalutamide [8,13]. If rigorously executed, standardization of imaging protocols and continuously calibrated PET devices allow for high test-retest repeatability [14], but biological aspects or interpatient and intrapatient variability can have a significant impact on quantitative features in repeated imaging studies [15].
In this regard, a recent study has reported high repeatability for 36 lesions in 12 patients using 18F-DCFPyL [16]. In this prospective clinical trial, we aimed to elucidate the repeatability of quantitative parameters on 18F-DCFPyL PET in a test-retest cohort by enrolling 21 men with PC with a total of 230 visible lesions. This relatively large cohort with a correspondingly large number of disease sites enabled evaluation of repeatability among different organ compartments, such as in LN or osseous lesions, and across a wide range of SUVs. In addition, such an approach also allowed us to assess the dependence of repeatability on SUV in original and relative units (in %) and to determine whether higher SUVs have improved repeatability when compared to lower SUVs. This may be of importance for response assessment studies, where percentage change in SUV between baseline and follow-up scans is a method to define progressive disease [17] or follow response in patients receiving PSMA-targeted radioligand therapy (RLT). Of note, higher SUVs from PSMA PET are linked to better early biochemical response [18] and overall survival [19] in patients under PSMA-directed treatment. Thus, the reader deciding on the appropriateness of RLT must have certainty on the reliability of these semiquantitative parameters.

Materials and Methods
This study was registered at ClinicalTrials.gov (NCT03793543) and was carried out under a United States FDA Investigational New Drug Application (IND121064). The Institutional Review Board of the Johns Hopkins Hospital approved this prospective study (IRB00174393).

Patients. Patient characteristics are displayed in Table 1. Twenty-one patients with a mean age of 65.4 ± 9.4 years and a history of PC were included in this trial. Among others, required inclusion criteria for the study were as follows: (1) age ≥ 18 years, (2) history of histologically or cytologically confirmed adenocarcinoma of the prostate without neuroendocrine differentiation, (3) metastatic castration-sensitive or castration-resistant prostate cancer (CRPC) with evidence of metastatic disease on conventional imaging with computed tomography (CT) and/or bone scan, and (4) Eastern Cooperative Oncology Group performance status of ≤2 [14].
The exclusion criteria were as follows: (1) serious or uncontrolled coexistent nonmalignant disease, including active and uncontrolled infection; (2) administration of a radioisotope ≤5 physical half-lives prior to the first PET/CT; and (3) administration of an intravenous X-ray contrast medium ≤24 hours or oral contrast medium ≤120 hours prior to the first PET/CT.

Imaging Protocol.
18F-DCFPyL was synthesized as previously described [7]. The imaging protocol followed current guidelines [20]. Patients were scanned in the supine position starting from the mid-thigh to the vertex of the skull (whole body protocol) at approximately 60 min postinjection. PET/CT was obtained using a 128-slice Biograph mCT (Siemens Healthineers, Erlangen, Germany) with low-dose CT attenuation correction (no contrast, 120 kV, 40 effective mAs, 0.5 tube rotation time, and 0.8 pitch). Standard ordered-subset expectation maximization reconstructions with time-of-flight were used. A subsequent near-term 18F-DCFPyL PET/CT follow-up scan with identical imaging protocol was conducted to assess test-retest repeatability.
No change in therapy occurred between the scans.

Image Analysis.
A consensus central review was carried out with all images analyzed by three physicians with experience in the interpretation of PSMA-targeted PET/CT (BH, RAB, and RAW, having at least 3 years of experience in reading scans) who were blinded to clinical data. Images were analyzed using the InterView Fusion software (Version 3.08.005.0000, Mediso Medical Imaging Ltd., Budapest, Hungary) for lesion identification and segmentation. As described in [21], the entire volume of all 18F-DCFPyL-avid tumor lesions (i.e., tumor burden) was manually segmented using volumes of interest. Mean and maximum standardized uptake values (SUVmean, SUVmax) were assessed. In addition, tumor volume (TV) was computed, which allowed for calculation of total-lesion PSMA (TL-PSMA, defined as TV × SUVmean) [22].
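As an illustration of how the four lesion parameters relate to one another, the following minimal sketch derives SUVmax, SUVmean, TV, and TL-PSMA from the voxel values inside one volume of interest. It is not the InterView Fusion implementation; the function name `lesion_metrics`, the flat list of voxel SUVs, and the voxel volume of 0.5 mL are hypothetical choices made only for this example.

```python
def lesion_metrics(voxel_suvs, voxel_volume_ml):
    """Summarize one segmented lesion.

    voxel_suvs: SUVs of all voxels inside the lesion's volume of interest.
    voxel_volume_ml: volume of a single voxel in mL (hypothetical input).
    """
    if not voxel_suvs:
        raise ValueError("lesion contains no voxels")
    suv_max = max(voxel_suvs)                     # hottest voxel
    suv_mean = sum(voxel_suvs) / len(voxel_suvs)  # average over the lesion
    tumor_volume = len(voxel_suvs) * voxel_volume_ml
    tl_psma = tumor_volume * suv_mean             # TL-PSMA = TV x SUVmean
    return {
        "SUVmax": suv_max,
        "SUVmean": suv_mean,
        "TV": tumor_volume,
        "TL-PSMA": tl_psma,
    }


# Toy lesion (invented values): seven voxels at SUV 10 and one at SUV 20.
toy_lesion = [10.0] * 7 + [20.0]
metrics = lesion_metrics(toy_lesion, voxel_volume_ml=0.5)
print(metrics)
```

Because TV and TL-PSMA both scale with the number of voxels included, any change in the drawn contour between two scans affects them directly, whereas SUVmax depends on the contour only through whether the hottest voxel is included.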

Statistical Analysis.
Corresponding uptake parameters were compared between both scans. Scatter diagrams were plotted, and linear regression analysis was performed. Bland-Altman plots were created for both absolute and relative differences of these data (expressed as a percentage), including upper and lower limits of agreement [23,24]. For correlation of uptake, Pearson correlation was performed (providing R2). Kendall's tau (τ) was also used for correlational analyses, with τ ≥ 0.40 indicating strong correlation [25,26]. The within-subject coefficient of variation (wCOV, in %) was assessed [27]. For comparison of different wCOVs, the method of Forkman was used [28]. A lesion-based head-to-head comparison including LN, osseous, and other lesions was conducted. Moreover, to assess for a dependence of the repeatability on different parameters, all lesions were subdivided into a group below ("< median") vs. above ("> median") the corresponding median value. In addition, the hottest lesion per patient (defined as the metastatic site of disease with the highest SUVmax among all lesions) was also analyzed. A P value <0.05 was considered statistically significant. Statistical analysis was performed with MedCalc software (Version 19.6, MedCalc Software Ltd., Ostend, Belgium) and Microsoft Excel 2016 (Microsoft Corporation, Redmond, WA, USA).
Figure 1 shows a test-retest scan of a patient with low tumor burden and Figure 2 of a patient with high tumor burden. An overview of uptake parameters including SUVmax, SUVmean, TL-PSMA, and PSMA-TV can be found in Table 2.
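The repeatability statistics named in the Statistical Analysis paragraph (wCOV, Bland-Altman limits of agreement, and Pearson correlation) can be sketched for paired test-retest values as follows. This is a minimal illustration and not the MedCalc implementation: the root-mean-square formulation of wCOV and the 1.96 × SD limits of agreement are common textbook forms assumed here, and the function names and the three value pairs are hypothetical.

```python
import math


def wcov_percent(test, retest):
    """Within-subject coefficient of variation in %, root-mean-square
    method for two measurements per lesion (assumed formulation)."""
    terms = []
    for a, b in zip(test, retest):
        pair_mean = (a + b) / 2.0
        terms.append(((a - b) / pair_mean) ** 2 / 2.0)
    return 100.0 * math.sqrt(sum(terms) / len(terms))


def bland_altman(test, retest, relative=False):
    """Bias and 95% limits of agreement of the retest-minus-test
    differences, in original units or in % of the pair mean."""
    diffs = []
    for a, b in zip(test, retest):
        d = b - a
        if relative:
            d = 100.0 * d / ((a + b) / 2.0)
        diffs.append(d)
    n = len(diffs)
    bias = sum(diffs) / n
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def pearson_r(x, y):
    """Pearson correlation coefficient; square it to obtain R2."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Invented SUVmax values for three lesions on scan 1 and scan 2.
scan1 = [10.0, 20.0, 30.0]
scan2 = [11.0, 19.0, 33.0]

bias, lower, upper = bland_altman(scan1, scan2)
r_squared = pearson_r(scan1, scan2) ** 2
print(f"wCOV = {wcov_percent(scan1, scan2):.1f}%")
print(f"bias = {bias:.2f}, limits of agreement = {lower:.2f} to {upper:.2f}")
print(f"R2 = {r_squared:.3f}")
```

A lower wCOV indicates better repeatability, while the Bland-Altman bias shows whether the second scan reads systematically higher or lower than the first.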

Repeatability Parameters on a Compartment-Based Level. When investigating different types of lesions, [...] (Table 2). Due to the small number, visceral lesions were not analyzed further.

Discussion
A total of 230 lesion pairs from 21 patients who underwent test-retest 18F-DCFPyL PET/CT were utilized to demonstrate overall high repeatability of uptake. Volumetric features revealed relatively lower repeatability, while SUVmean not only demonstrated the highest correlative indices (τ, 0.92-0.95) but also the best repeatability, in particular for LN (wCOV 3.8%). For SUVmax, robust correlations along with at least intermediate repeatability were noted in LN and osseous lesions, suggesting SUV as a reliable metric for quantitative assessments. For 18F-DCFPyL PET, SUV-based parameters might be an acceptable alternative to volumetric parameters [8]. Importantly, we observed an improved repeatability for higher SUVs when considered relative to the level of uptake (relative units).
18F-DCFPyL is a U.S.-wide, FDA-approved, PSMA-targeted, radiolabeled imaging agent for patients with PC [8,9,12], and more widespread use worldwide can be anticipated, indicating the importance of a thorough understanding of this agent. The high repeatability of uptake parameters, both overall and based on metastasis type, is of importance, as it suggests that 18F-DCFPyL may be useful for therapy response assessment and also that manual and automated (e.g., artificial intelligence) methods for lesion detection should be repeatable and reliable [13,29,30].
Previous studies have revealed comparable correlations and repeatability, but differences relative to the present trial must be noted. For instance, in a preceding analysis based on 68Ga-PSMA PET in a test-retest setting [31], the authors reported substantially higher wCOV, e.g., for SUVmean derived from LN. Further, no significant differences between lesion types were observed with the 68Ga-labeled PSMA imaging agent [31]. This may be partially explained by the improved diagnostic accuracy of radiotracers labeled with 18F [32]. Intrinsic physical factors of 68Ga may contribute to the partial volume effect, which in turn has an impact on semiquantitative values such as SUV [33], potentially explaining such different wCOVs.
A recent study by Jansen et al. also reported on test-retest properties for 18F-DCFPyL, including a total of 36 lesions [16]. Similar to our findings, SUVmean had a better repeatability when compared to SUVmax [16]. However, no significant differences between LN and osseous lesions were identified in the previous trial, although a trend towards significance was noted (P = 0.06) [16]. In our study, significant differences between lesions located in the skeleton and LN were determined, possibly due to the increased number of subjects and lesions [16]. In this regard, relative to the investigation of Jansen et al. [16], more lesions were included (230 vs. 36), providing a broad range of SUV (1.4-80.5). This allowed us to demonstrate a dependence of repeatability on SUV, with higher SUVs having a higher repeatability, in particular for relative SUVmax values (Figure 3, third column). This observation is of importance, as absolute SUVs have different ranges depending on their normalization schemes, whereas relative differences allow for intra- and interindividual comparisons [34]. In addition, this marked dependence of repeatability on SUV in relative units may be clinically relevant, e.g., for response assessment studies, where it is common to indicate percentage change in SUV by comparing baseline and follow-up scans, as recently demonstrated for 18F-DCFPyL [17]. Assessment of delta % has also been recently suggested by the PSMA PET Progression Criteria, with an increase in PSMA uptake of 30% indicating progressive disease [35]. As such, the observed improvement of relative repeatability at higher SUVs may be important for future multicenter trials, e.g., for 18F-DCFPyL-based therapy response monitoring [8] or for patients scheduled for RLT.
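The median-split comparison and the use of percentage change discussed above can be illustrated with a short sketch. All values below are invented and do not represent study data; the helper names, the use of the pair mean to assign lesions to the "< median" and "> median" groups, and the root-mean-square form of wCOV are assumptions made for this example. The toy pairs all differ by 0.4 SUV units between scans, so the example only shows how a fixed absolute difference translates into a smaller relative difference at higher SUVs.

```python
import math
import statistics


def wcov_percent(pairs):
    """Within-subject coefficient of variation in % (root-mean-square
    method, assumed formulation) for (test, retest) pairs."""
    terms = [((a - b) / ((a + b) / 2.0)) ** 2 / 2.0 for a, b in pairs]
    return 100.0 * math.sqrt(sum(terms) / len(terms))


def split_by_median(pairs):
    """Divide lesions into groups below and above the median of the
    pair means (grouping variable is an assumption of this sketch)."""
    means = [(a + b) / 2.0 for a, b in pairs]
    median = statistics.median(means)
    low = [p for p, m in zip(pairs, means) if m < median]
    high = [p for p, m in zip(pairs, means) if m > median]
    return low, high


def percent_change(baseline, follow_up):
    """Relative change in % between a baseline and a follow-up SUV."""
    return 100.0 * (follow_up - baseline) / baseline


# Invented (test, retest) SUVmax pairs with the same absolute difference.
pairs = [(2.0, 2.4), (3.0, 2.6), (4.0, 4.4),
         (20.0, 20.4), (40.0, 39.6), (80.0, 80.4)]

low, high = split_by_median(pairs)
print(f"wCOV below median: {wcov_percent(low):.1f}%")
print(f"wCOV above median: {wcov_percent(high):.1f}%")

# The same 0.4-unit difference expressed as a percentage change.
print(f"{percent_change(2.0, 2.4):.0f}% vs. {percent_change(80.0, 80.4):.1f}%")
```

Under these assumptions, a percentage-based progression threshold such as the 30% increase cited in the text lies much closer to the test-retest difference of a low-uptake lesion than to that of a high-uptake lesion.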
In this regard, no study to date has explored the predictive potential of 18F-labeled PSMA PET for subsequent outcomes in patients with PC scheduled for PSMA-directed therapy [36]. The repeatability of SUVmax units demonstrated in this study may lay the foundation for future investigations of the utility of 18F-DCFPyL PET in monitoring 177Lu-based RLT. These considerations are further fueled by the fact that in patients scheduled for PSMA-targeted RLT, high average SUVs on baseline PSMA PET are frequently observed (up to 73.4) and that increased baseline SUVmax values were linked to improved early biochemical response (cut-off, >19.8) [18] and overall survival (cut-off, >14.3) [19]. In this analysis, the highest SUVmax was 80.5, and thus, the results are relevant to the patient population undergoing RLT. The higher repeatability at higher SUVmax may be of importance in the theranostic setting, as the reader deciding on RLT can be certain that such findings are not related to measurement variability, suggesting that SUVmax is a reliable imaging biomarker to identify high-risk patients prone to treatment failure. This applies regardless of whether lesions are located in the skeleton (Figure 2) or LN (Figure 1), as repeatability of SUVmax was high to intermediate among metastases allocated to different organ compartments (Table 2).
Both 68Ga- and 18F-labeled, PSMA-directed radiotracers demonstrate that the best repeatability is found with SUV, whereas values for TV may have to be interpreted with caution [16,31]. As a possible explanation, the latter parameter may be subject to an operator-dependent bias of manual segmentation. Fully automated delineation software may increase repeatability, e.g., when artificial intelligence such as deep learning is applied [37]. Moreover, state-of-the-art reconstruction algorithms such as point-spread function (PSF) may also recategorize lesions as more definitive sites of disease attributable to PC, as recently demonstrated for 18F-DCFPyL [38]. However, the effect of PSF on repeatability in patients scheduled for 18F-DCFPyL has also been reported, with PSF reconstruction having a significant negative impact on repeatability for SUV, but not for TV [16]. Given these contradictory results of increased interpretative certainty and decreased repeatability with PSF, future studies should explore the impact of novel and advanced reconstruction algorithms on test-retest metrics.
This study has several limitations. Although providing the largest cohort of patients and lesions to date, some patients had a disproportionate number of lesions, and clustering effects from that lesion distribution may have affected the results. Therefore, a hottest lesion analysis investigating the metastatic site with the highest SUVmax per subject was also performed. Again, a high repeatability with no systematic increase or decrease was noted (Supplementary Figure), further corroborating the findings including all suspected sites of disease. Moreover, lesion size, dose, and patient factors including interpatient and intrapatient variability can have a significant impact on semiquantitative assessments using this radiotracer [15,39]. Therefore, future studies should also consider controlling for such day-to-day variables [16]. Partial volume effects are almost certainly a factor in repeatability in small lesions, and future test-retest studies might exclusively enroll patients with extensive tumor burden. Such an approach would then corroborate our present findings across a broad spectrum of tumor burden. Despite enrolling the largest cohort of patients in a prospective test-retest setting for 18F-DCFPyL to date, the number of patients with different therapies was too small to provide reliable results for a subanalysis focusing on prior therapeutic regimens. This should also be addressed in future studies.

Table 2: Head-to-head comparison of semiquantitative parameters for both scans, for all lesions (n = 230), osseous (n = 177), and lymph node lesions (n = 38), mean value and standard deviation along with respective Pearson correlation, Kendall's tau (τ), and within-subject coefficient of variation (wCOV). Regardless of which statistical test was used, mean standardized uptake value (SUVmean) achieved the highest correlative indices (Pearson correlation, Kendall's τ) and the lowest wCOV, indicating excellent repeatability, in particular for lymph node disease (marked in italic). Volumetric features, however, revealed lower τ and still acceptable repeatability, as indicated by increased wCOV. SUVmax: maximum standardized uptake value; PSMA-TV: PSMA tumor volume; TL-PSMA: total lesion PSMA.

Conclusion
Our results demonstrate that 18F-DCFPyL has highly repeatable uptake parameters in PC lesions. Further, the large number of lesions and wide distribution of SUVs included in this analysis allowed for the demonstration of a dependence of repeatability on original and relative SUVs, with higher SUVs having more robust repeatability. This observed improvement of repeatability at increased SUVs may be important for future multicenter trials, e.g., for 18F-DCFPyL-based response monitoring in patients under antihormonal treatment.

Data Availability
The data are not publicly available because, due to the European regulations regarding data protection, we cannot make data available online or disburse them. However, all data are available for revision on-site.

Ethical Approval
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. The Institutional Review Board of the Johns Hopkins Hospital approved this prospective study (IRB00174393).

Consent
Informed consent was obtained from all individual participants included in the study.

Disclosure
Results of this work are part of the doctoral thesis of BH, planned to be submitted at the Medical Faculty of Bonn University. This publication was supported by the Open Access Publication Fund of the University of Wuerzburg.

Conflicts of Interest
MGP is a coinventor on a US patent covering 18F-DCFPyL and as such is entitled to a portion of any licensing fees and royalties generated by this technology. This arrangement has been reviewed and approved by the Johns Hopkins University in accordance with its conflict of interest policies. SPR is a consultant for Progenics Pharmaceuticals Inc., the licensee of 18F-DCFPyL. MAG has served as a consultant to Progenics Pharmaceuticals. All other authors declare that there is no conflict of interest as well as consent for scientific analysis and publication.

Authors' Contributions
All authors contributed to writing, critically reviewing, and approving the paper. Specific author contributions are as follows: conceptualization was done by SPR, MGP, RAW, TH, MAG, KJP, MAE, MCM, and MAL; methodology was done by RAW, MGP, SPR, LB, RAB, TD, BH, TH, RA, and AS; software was acquired by RAW, BH, SL, PH, and SES; validation was done by MGP, AKB, TD, CL, ME, KJP, MAL, and LS; formal analysis was done by RAW, RAB, BH, SL, MAL, and SPR; investigation was done by RA, AS, and SPR; visualization was done by TH, RAB, and LB; supervision was done by MGP, MAL, SPR, KJP, and MCM; project administration was done by RA, AS, and SPR; funding acquisition was done by MGP, SPR, TH, RAW, and TD. Rudolf A. Werner, Bilêl Habacha, Ralph A. Bundschuh, and Steven P. Rowe equally contributed to this work.

Funding
Funding for this study was received from the Prostate Cancer Foundation Young Investigator Award, Movember Foundation, and National Institutes of Health grants CA134675, CA184228, EB024495, and CA183031. This work was supported by HiLF at Hannover Medical School (TD, RAW) and the "RECTOR" Program at Okayama (TH). A KAKENHI grant (21K19450) has been provided for Dr. T. Higuchi from the Japan Society for the Promotion of Science (JSPS). MAG and SPR have received research funding from Progenics Pharmaceuticals.

Supplementary Materials
Supplementary Figure: hottest lesion analysis in a test-retest setting, with correlation of (A) maximum standardized uptake values (SUVmax), (B) mean standardized uptake values (SUVmean), and (C, D) corresponding Bland-Altman plots. An excellent correlation between test and retest scans along with a considerably low magnitude of limits within standard deviations (SD) was noted.
Supplementary Table: differences in within-subject coefficients of variation (wCOVs, in %) for all parameters, divided into a group below (<) vs. a group above (>) the corresponding median value (n = 115 per group). Regardless of the investigated parameter, lesions above the median had a more robust repeatability, which was markedly better for standardized uptake value (SUV), in particular for SUVmean. SUVmax: maximum SUV; PSMA-TV: PSMA tumor volume; and TL-PSMA: total lesion PSMA. P has been derived from comparison of wCOV from lesions below vs. above the respective median. SD: standard deviation.