The Role of [18F]FDG-PET/CT in Predicting Malignant Transformation of Plexiform Neurofibromas in Neurofibromatosis-1

Background. Malignant peripheral nerve sheath tumours (MPNSTs) are difficult to diagnose and treat and contribute to significant morbidity and mortality for patients with Neurofibromatosis-1 (NF-1). FDG-PET/CT is being increasingly used as an imaging modality to discriminate between benign and malignant plexiform neurofibromas. Objectives. To assess the value of FDG-PET/CT in differentiating between benign and malignant peripheral nerve lesions for patients with Neurofibromatosis-1. Methods. A systematic review of the literature was performed prior to application of stringent selection criteria. Ultimately 13 articles with 796 tumours were deemed eligible for inclusion into the review. Results. There was a significant difference between mean SUVmax of benign and malignant lesions (1.93 versus 7.48, resp.). Sensitivity ranged from 89 to 100% and specificity from 72 to 94%. ROC analysis was performed to maximise sensitivity and specificity of SUVmax cut-off; however no clear value was identified (range 3.1–6.1). Significant overlap was found between the SUVmax of benign and malignant lesions making differentiation of lesions difficult. Many of the studies suffered from having a small cohort and from not providing histological data on all lesions which underwent FDG-PET/CT. Conclusion. This systematic review is able to demonstrate that FDG-PET/CT is a useful noninvasive test for discriminating between benign and malignant lesions but has limitations and requires further prospective trials.


Introduction
Neurofibromatosis type one (NF-1) is a common inherited disorder affecting from 1 in 2,000 to 1 in 5,000 live births; it is an autosomal dominant condition characterized by cutaneous lesions, skeletal dysplasias, and the tendency to form soft tissue tumours on peripheral nerves such as plexiform neurofibromas. These plexiform neurofibromas have potential for sarcomatous transformation to malignant peripheral nerve sheath tumours (MPNSTs) [1,2]. MPNSTs represent 5-10% of all soft tissue sarcomas and are more prevalent in NF-1, contributing significantly to the morbidity and mortality of these patients. MPNSTs carry an ominous prognosis with a 5-year survival of up to 60% due to delayed diagnosis, early metastasis, and poor response to systemic therapy [1,3].
The mainstay of treatment for MPNSTs remains surgical excision. The ability of benign neurofibromas to mimic MPNST with clinical features such as increased growth rate, irregular contour, and pain has made diagnosis without surgical excision for histology challenging. Contrast enhanced CT and MRI have been shown to be suboptimal imaging modalities for diagnosis of potential MPNST in NF-1; despite being helpful in detecting nodular lesions, these modalities have variable potential in differentiating between benign and malignant disease and limited ability to quantitatively analyse suspicious lesions [4-6].
[18F]2-Fluoro-2-deoxy-D-glucose PET/CT (FDG-PET/CT) is an imaging modality that noninvasively assesses in vivo glucose metabolism and is commonly used to stage solid tumour malignancies, monitor treatment response, and investigate for recurrence. The maximal standardized uptake value (SUVmax) is a unitless semiquantitative measure of FDG uptake and is used to assess the metabolic activity within a potentially malignant tumour. The use of FDG-PET/CT for the diagnosis of MPNST in patients with NF-1 has been a key area of research. Establishing a noninvasive way of diagnosing MPNSTs may lead to earlier treatment and improved prognosis. Tissue histopathology establishes a definitive diagnosis with high specificity but may require complete surgical excision of a suspect tumour, which is technically challenging and carries increased morbidity and mortality for the patient.
International Journal of Surgical Oncology
Multiple published studies have assessed semiquantitative FDG-PET/CT analysis, calculating the maximal standardized uptake value (SUVmax) within a tumour to differentiate between benign and malignant peripheral nerve sheath tumours, with varying results. The purpose of this systematic review is to synthesise and appraise the current evidence on the role of FDG-PET/CT in diagnosing MPNST and to identify potential future research directions.

Study Selection and Eligibility Criteria.
A review of the English language articles on the online databases PubMed, MEDLINE, Embase, and Scopus was performed using the MeSH/key terms "Nerve Sheath Neoplasms", "Positron-Emission Tomography", "Neurofibromatosis 1", and "Peripheral nerve sheath tumors".
Eligible publications were required to have included SUVmax for the semiquantitative analysis of plexiform neurofibromas to diagnose malignant transformation, with histopathological correlation or informed clinical follow-up as the reference standard. Articles focusing primarily on the use of other quantitative variables such as the tumour-to-liver ratio were not included. Case reports, conference abstracts, poster presentations, book chapters, and review articles were excluded from this review. Figure 1 describes the study selection process.

Synthesis of Results.
Qualitative analysis of the studies was performed with contribution from each study's data items of interest. A meta-analysis was not performed due to a lack of sufficient homogeneity between the studies. Data was extracted using a predetermined standardized table.

Results
A total of 97 articles were found once duplicates were removed. The abstracts of these articles were reviewed: 23 articles were excluded as small case reports, and 14 were excluded as review articles, book chapters, abstracts for presentation, or posters, or as pertaining to veterinary science. The remaining 60 full-text articles were then reviewed for relevance to our clinical question; a further 47 articles were excluded, including 3 papers containing duplicate data. Review of the citations of all relevant studies did not yield any further trials. Ultimately 13 articles were eligible for this review; no randomised controlled trials were identified. Table 1 depicts the demographics of included studies.

Imaging Modalities.
In all included studies, patients were administered intravenous [18F]2-fluoro-2-deoxy-D-glucose following a fasting time of 4-6 hours. Whole body PET with CT imaging was then acquired. The interval between administration of FDG and acquisition of imaging varied slightly: early images were taken between 45 and 90 minutes after administration, and in articles which included delayed imaging this was performed 240 minutes after administration of FDG. Table 2 shows characteristics of the included studies regarding their SUVmax of malignant and benign lesions as well as the sensitivity, specificity, positive predictive value, negative predictive value, and accuracy. Mean SUVmax was 1.93 and 7.48 for benign and malignant lesions, respectively, across all trials.

Diagnostic Potential of FDG-PET/CT.
The medians for PPV and NPV were 40% and 100%, respectively, across the studies with a mean accuracy of 83.5%.
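As a minimal sketch of how these measures relate, the function below derives sensitivity, specificity, PPV, NPV, and accuracy from a 2x2 table of scan results against the reference standard. The counts are hypothetical and are not taken from any included study; they were chosen only to illustrate how a cohort with few malignant lesions can combine a low PPV with a high NPV.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return the standard diagnostic accuracy measures from 2x2 counts.

    tp/fp: scan-positive lesions that were malignant/benign on the reference standard.
    fn/tn: scan-negative lesions that were malignant/benign on the reference standard.
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical cohort (illustration only): 50 lesions, 4 of them malignant.
metrics = diagnostic_metrics(tp=4, fp=6, fn=0, tn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

With no false negatives the NPV is 100% regardless of prevalence, whereas a handful of false positives is enough to pull the PPV down when malignant lesions are rare.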

Optimum SUVmax Cut-Off.
Optimum SUVmax to maximise sensitivity and specificity is shown in Table 2. A wide range is noted. Sensitivities ranged from 91% to 100% although specificity ranged from 72% to 95%.

Primary and Secondary Outcomes.
There is a noted difference in the SUVmax of benign versus malignant plexiform neurofibromas as shown in Table 2; there is however considerable overlap in the ranges of SUVmax for benign and malignant lesions. FDG-PET/CT was shown to be effective in the diagnosis of malignant lesions, with a mean sensitivity of 91%. There was insufficient evidence to accept a universal cut-off value for SUVmax (ROC cut-off ranging from 3.1 to 6.1), which is reinforced by the aforementioned range in specificity.

Comparative Data.
It is difficult to compare these studies directly owing to a lack of sufficient homogeneity. Two of the included trials focus on the paediatric population [14,15]. Ferner et al. [7] showed that, on quantitative analysis, malignant lesions had a statistically significantly higher SUVmax than benign lesions (mean 5.4 ± 2.4 versus 1.54 ± 0.7, resp., P = 0.002). This finding of a statistically significant difference in means was also evident in articles published by Cardona et al. and Bredella et al. [8,9]. Qualitatively 2 benign tumours were classified as malignant; however there were no false negatives, and an overlap of SUVmax readings for benign and malignant tumours was identified between SUVmax 2.7 and 3.3 by Ferner et al. [7]. This range of overlapping SUVmax values between benign and malignant lesions remains the greatest issue with FDG-PET/CT in this context. False positive tumours tend to fall within this overlapping region; as Table 2 shows, malignant lesions will likely have a much higher SUVmax on average.
In 2008 Ferner et al. [10] performed a follow-up trial that included FDG-PET/CT with a 4-hour delay from injection of tracer to imaging, as this was determined to be the optimal time in their previous study [7]. Mean SUVmax of benign and malignant lesions was 1.5 and 5.7, respectively. No malignant tumours were found with SUV < 1.5; however there were 3 benign tumours with SUV > 3.5. Between SUVmax 2.5 and 3.5, seven benign and six malignant lesions were found. There were four false positive and three false negative scans. The sensitivity for high-grade MPNSTs was 100%.
Table 2 (rows recoverable from the extracted text; columns, as far as they can be recovered, are imaging time after injection in minutes; mean SUVmax of benign lesions; mean SUVmax of malignant lesions; sensitivity in %; specificity in %; early values are followed by delayed values in parentheses):
Cardona et al. [8]: n/a; 1.0; 4.1; 100; 83
Karabatsou et al. [11]: 60; 2.6; 10.4; n/a; n/a
Warbey et al. [12]: 90 and 240; 2 (1.9); 7 (8.1); 97; 87
Meany et al. [17]: 60-90; n/a; n/a; n/a; n/a
Delayed imaging has shown some potential; Ferner et al. [10], Warbey et al. [12], and Chirindel et al. [19] all included delayed imaging 4 hours after FDG administration. Warbey et al. [12] were able to demonstrate a statistically significant difference between early and delayed imaging (P = 0.002) for lesions classified as malignant on FDG-PET/CT but not for benign lesions. The mean SUVmax for malignant lesions increased from 7.0 on early to 8.1 on delayed imaging; they were able to obtain sensitivity and specificity of 97% and 87%, respectively. Chirindel et al. [19], however, were unable to replicate these results, reporting 84% specificity for early versus 81% for delayed imaging.
In 2010 Benz et al. [13] performed a combined prospective and retrospective study in which the mean SUVmax of MPNSTs was found to be significantly higher than that of benign lesions (12.0 ± 7.1 versus 3.4 ± 1.8, P < 0.001). There were two false positive scans and one false negative scan in this cohort. ROC analysis gave an optimal SUVmax cut-off of 6.1, leading to sensitivity and specificity of 94% and 91%, respectively. This cut-off is considerably higher than the thresholds determined in the studies by Warbey et al., Derlin et al., Salamon et al., and Chirindel et al. [12,16,18,19], and applying it would lead to several false negative scans. PPV, NPV, and diagnostic accuracy were 89%, 95%, and 93% (P < 0.001), respectively.
Moharir et al. [14] retrospectively analysed 18 children with NF-1 who underwent plexiform neurofibroma surveillance and reported a sensitivity and a specificity of 100% and 86%, respectively, with a PPV of 50% and an NPV of 100%. This was the first study to evaluate the utility of FDG-PET/CT in children. Although only early imaging, 45 minutes after FDG injection, was used in this trial, the authors state that early and delayed imaging are now their standard practice owing to the findings of Warbey et al. [12]. Tsai et al. [15] also analysed the paediatric population and found that the mean SUVmax of benign and malignant lesions was 2.49 and 7.63, respectively. SUVmax cut-offs of 3, 4, and 5 yielded sensitivities of 100%, 100%, and 89% and specificities of 81%, 94%, and 94%, respectively. Eight of 27 lesions were MPNSTs and none had SUVmax < 4. Of the 16 plexiform neurofibromas, 8 were classified as atypical, that is, with histological findings consisting of hypercellularity and hyperchromatic nuclei with the absence of mitotic figures [20]; one of these lesions had an SUVmax of 6.90. Although atypical neurofibromas can make histological diagnosis difficult, they are classified as benign. They can transform to MPNST, but plexiform neurofibromas also have this ability, and either can display varying SUVmax. There exists a significant overlap between plexiform neurofibromas, atypical neurofibromas, and MPNSTs on FDG-PET/CT. These findings correlate with those in the adult population; however the sample sizes remain small, which makes the findings difficult to validate. It is important to note that not all lesions with overlapping SUVmax are found to be atypical; a range of SUVmax values can occur regardless of histological diagnosis.
SUVmax is calculated by dividing the activity concentration within the tissue by the injected activity normalised to body size. Several factors can affect the SUVmax measurement, including biological factors such as body weight/size, blood glucose level, respiratory effort, and the amount of time between injection of radionuclide and scanning. Technical factors such as scanner variability, reconstruction parameters, calibration error, and interuser variability can also affect SUVmax [21]. A trial by Velasquez et al. [22] assessing the reproducibility of SUVmax in patients with scans taken 7 days apart found a coefficient of variation of 10%-12%, which increased to as much as 21% when variables such as time from injection to scan were changed.
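The calculation described above can be sketched as follows. This is an illustration under stated assumptions rather than the method of any included study: body size is taken to be body weight, tissue density is taken to be 1 g/mL so that the result is unitless, and the injected activity is assumed to be already decay-corrected to scan time. The function names and the example values are hypothetical.

```python
def suv_body_weight(tissue_kbq_per_ml, injected_mbq, body_weight_kg):
    """Body-weight-normalised SUV for one voxel or region.

    Assumes 1 g/mL tissue density and an injected activity that is
    already decay-corrected to the time of the scan.
    """
    injected_kbq = injected_mbq * 1000.0
    body_weight_g = body_weight_kg * 1000.0
    return tissue_kbq_per_ml / (injected_kbq / body_weight_g)


def suv_max(voxel_kbq_per_ml, injected_mbq, body_weight_kg):
    """SUVmax: the highest single-voxel SUV within the lesion."""
    return max(
        suv_body_weight(v, injected_mbq, body_weight_kg) for v in voxel_kbq_per_ml
    )


# Hypothetical lesion: voxel activity concentrations in kBq/mL,
# 350 MBq injected, 70 kg patient.
lesion_voxels = [6.0, 11.5, 17.5, 9.0]
lesion_suv_max = suv_max(lesion_voxels, injected_mbq=350.0, body_weight_kg=70.0)
print(lesion_suv_max)
```

Because every term in the denominator is patient- or protocol-specific, errors in recorded weight, dose, or uptake time pass directly into the reported value, which is one source of the variability discussed above.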
Salamon et al. [18] provided values for the tumour-to-liver (TTL) ratio, which, in addition to SUVmax, may offer a method of decreasing the aforementioned variability by referencing the patient's own tissue uptake of FDG. They were able to show a statistically significant difference in mean SUVmax between benign and malignant lesions and, using the established cut-off of 3.5, were able to increase specificity from 64.5% with SUVmax to 90.3% with TTL ratio [18]. These findings of increased specificity with the incorporation of TTL have been reproduced; however more data are required before they can influence clinical decision making.
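A minimal sketch of the TTL ratio follows, assuming it is the lesion SUVmax divided by a mean SUV measured in normal liver; the exact liver reference used by each study is not stated here, so that choice, the function name, and the example values are assumptions.

```python
def tumour_to_liver_ratio(tumour_suv_max, liver_suv_mean):
    """Ratio of lesion SUVmax to the patient's own liver uptake.

    The liver reference (mean SUV in a region of normal liver) is an
    assumption made for this sketch.
    """
    if liver_suv_mean <= 0:
        raise ValueError("liver uptake must be positive")
    return tumour_suv_max / liver_suv_mean


# Hypothetical patient: lesion SUVmax 4.2, liver mean SUV 2.1.
ttl = tumour_to_liver_ratio(tumour_suv_max=4.2, liver_suv_mean=2.1)
print(ttl)
```

Since both values come from the same scan, factors that scale all uptake values equally, such as an error in the recorded dose or weight, cancel in the ratio.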

Optimal SUVmax Cut-Off.
No optimal SUVmax cut-off exists in the published literature. The use of SUVmax 3.5 as a cut-off was adopted by some trials [16,18]. ROC analysis was performed as noted in Table 2, which shows the optimum SUVmax cut-off to maximise sensitivity and specificity. The range varies; however all but one fall between 3.0 and 4.0. This makes interpretation difficult as this level is the grey zone within which both benign and malignant lesions can occur. Further research is needed with the addition of delayed imaging in order to better guide clinical decision making.
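One common way to pick a cut-off from ROC data is to maximise Youden's J (sensitivity + specificity - 1). The included studies do not state which criterion they used, so the sketch below is an assumption, and the SUVmax values are hypothetical; they are chosen to overlap between roughly 2.9 and 4.2 to mimic the grey zone.

```python
def best_cutoff_youden(benign, malignant):
    """Return (J, cutoff, sensitivity, specificity) maximising Youden's J.

    A lesion is called malignant when its SUVmax is >= cutoff.
    Ties are resolved in favour of the lowest cut-off.
    """
    best = None
    for cutoff in sorted(set(benign) | set(malignant)):
        sensitivity = sum(v >= cutoff for v in malignant) / len(malignant)
        specificity = sum(v < cutoff for v in benign) / len(benign)
        j = sensitivity + specificity - 1
        if best is None or j > best[0]:
            best = (j, cutoff, sensitivity, specificity)
    return best


# Hypothetical SUVmax values (illustration only).
benign = [1.0, 1.5, 1.9, 2.2, 2.8, 3.1, 3.6, 4.2]
malignant = [2.9, 3.8, 5.5, 7.4, 9.0, 12.1]

j, cutoff, sensitivity, specificity = best_cutoff_youden(benign, malignant)
print(cutoff, round(sensitivity, 3), round(specificity, 3))
```

With overlapping distributions no cut-off separates the groups perfectly: lowering the threshold to capture the malignant lesion at 2.9 costs specificity, and raising it to exclude every benign lesion costs sensitivity.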

Novel Parameters.
This paper looks exclusively at the value of SUVmax for distinguishing benign and malignant disease. As noted above, SUVmax has its own limitations as a semiquantitative method of analysis, and the use of novel parameters may be able to eliminate some of this variability. Novel parameters such as metabolic tumour volume (MTV) and total lesion glycolysis (TLG) have shown promise but lack adequate evidence to justify routine use. Two retrospective studies were identified showing with statistical significance (P < 0.01) that, on both a patient and a lesion basis, MPNSTs had a higher MTV and TLG [23,24]. A further trial showed that TLG was a useful prognostic marker when compared to SUVmax, TTL ratio, and HImax [25]. This opens avenues for future research to validate the utility of these novel parameters in the assessment of patients with NF-1; currently there is only sparse literature, with evidence of selection bias, on their role in distinguishing between benign and malignant lesions. Derlin et al. [16] provided values in addition to SUVmax, namely a Homogeneity Index (HImax) incorporating the homogeneity of the lesion. They were able to demonstrate a statistically significant increase in specificity for distinguishing benign from malignant lesions with HImax, which provides an avenue for further research [16].
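For orientation, MTV and TLG can be sketched as below. MTV is taken as the volume of lesion voxels at or above a segmentation threshold, and TLG as MTV multiplied by the mean SUV of those voxels. The threshold of 41% of SUVmax is one commonly used convention and is an assumption here, as the segmentation method of the cited studies is not described; the voxel values are hypothetical.

```python
def mtv_and_tlg(voxel_suvs, voxel_volume_ml, threshold_fraction=0.41):
    """Return (MTV in mL, TLG) for one lesion.

    Voxels with SUV >= threshold_fraction * SUVmax are counted as
    metabolically active. The 41% default is an assumed convention.
    """
    threshold = threshold_fraction * max(voxel_suvs)
    active = [v for v in voxel_suvs if v >= threshold]
    mtv = len(active) * voxel_volume_ml
    suv_mean = sum(active) / len(active)
    return mtv, mtv * suv_mean


# Hypothetical lesion: five voxels of 0.5 mL each.
mtv, tlg = mtv_and_tlg([1.0, 2.0, 4.0, 6.0, 8.0], voxel_volume_ml=0.5)
print(mtv, tlg)
```

Unlike SUVmax, which depends on a single voxel, both measures reflect the whole segmented volume, but they inherit any variability in the chosen threshold.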
It is clear that there is no single ideal method or parameter for noninvasively distinguishing between benign and malignant disease in this cohort of patients. FDG-PET/CT and its various parameters such as SUVmax, TTL ratio, MTV, and TLG, together with other imaging techniques such as contrast enhanced CT and MRI, must be used along with clinical findings for individual patients. SUVmax remains the single best parameter currently available, with the most support in the literature; this may change in the future with ongoing research.

Bias and Study Designs.
Certain biases and study design limitations must be noted in this systematic review. Studies tended to lean towards patients who had already undergone histological analysis, meaning they may have had more clinically advanced disease. Consideration must be given to the small sample sizes in the majority of the included trials, which make them much more prone to Type I error. Despite this selection bias towards patients with histological analysis, there remains a large number of patients whose results are based purely on clinical follow-up, which may not provide adequate information. Finally, studies introducing new units of measurement of FDG uptake have not been validated and their results are therefore difficult to interpret.

Conclusion
In summary, MPNST is a life-threatening disease that is known to arise by transformation of previously benign lesions in patients with NF-1. FDG-PET/CT has been shown to be a useful, noninvasive diagnostic tool for the assessment of malignant transformation of PNSTs in adults and children. It is able to predict with excellent sensitivity and negative predictive value whether malignant transformation has occurred. Its main shortcoming is that no ideal SUVmax cut-off value has been found and substantiated; although multiple trials have used 3.5 as a cut-off, there continue to be several false positive lesions [12,16,18].
Delayed imaging has a role in reducing the number of false positive findings; however it has been shown to have technical constraints, and further trials are required to validate the findings. Normalising SUV, whether to the patient's liver uptake or to lean body mass as performed by Salamon et al. and Chirindel et al., also has a potential role in differentiating between benign and malignant PNSTs; however the data on this are limited [18,19].
Further prospective trials, performed in a uniform fashion, are required in order to establish an ideal SUVmax cut-off, to determine the use of the tumour-to-liver ratio and other normalised values, and to increase the pool of data available in this area.

Additional Points
We systematically review the data on the use of FDG-PET/CT for determining malignant transformation of plexiform neurofibromas. FDG-PET/CT, although imperfect, provides extremely useful information for the management of patients with NF-1 and clinical signs of malignancy.