Automated detection and segmentation is a prerequisite for the deployment of image-based secondary analyses, especially for lung tumors. However, currently only applications for lung nodules ≤3 cm exist. Therefore, we tested the performance of a fully automated AI-based lung nodule algorithm for detection and 3D segmentation of primary lung tumors in the context of tumor staging using the CT component of FDG-PET/CT and including all T-categories (T1–T4). FDG-PET/CTs of 320 patients with histologically confirmed lung cancer performed between 01/2010 and 06/2016 were selected. First, the main primary lung tumor within each scan was manually segmented using the CT component of the PET/CTs as reference. Second, the CT series were transferred to a platform with AI-based algorithms trained on chest CTs for detection and segmentation of lung nodules. Detection and segmentation performance were analyzed. Factors influencing detection rates were explored with binomial logistic regression and radiomic analysis. We also processed 94 PET/CTs negative for pulmonary nodules to investigate the frequency of and reasons for false-positive findings. The ratio of detected tumors was best in the T1-category (90.4%) and decreased continuously: T2 (70.8%), T3 (29.4%), and T4 (8.8%). Tumor contact with the pleura was a strong predictor of misdetection. Segmentation performance was excellent for T1 tumors (
Failure to detect lung cancer on imaging studies is a very common reason for malpractice suits [
The diagnostic task of imaging in lung cancer, however, does not end with tumor detection. Tumor staging using 18F-fluorodeoxyglucose- (FDG-) PET/CT as the standard of care forms an integral part of the clinical diagnostic workup of patients with lung cancer [
Sexauer et al. have shown that manual annotation and segmentation of lung tumors is feasible, but tumor stage and lesion size and count correlate significantly with segmentation time [
It was thus the aim of this study to evaluate the performance of a fully automated computer-assisted detection and 3D segmentation algorithm that was initially designed for lung nodule detection and segmentation in the context of tumor staging. This was done using the CT component of FDG-PET/CT studies of a patient cohort with histologically proven primary lung tumors from all T-categories.
This study was conducted under the provisions of the appropriate Swiss regional ethics committee (
We compiled two datasets using an in-house-developed Radiology Information System/Picture Archiving and Communication System (RIS/PACS) search engine: First, we retrospectively identified 18F-fluorodeoxyglucose- (FDG-) PET/CTs with histologically proven primary lung cancer that were acquired at our institution between 01/2010 and 06/2016. Selection criteria were protocol name, time period, and verified tumor histology according to our pathology archive. This resulted in 320 PET/CTs (lung tumor population). Second, for the creation of a dataset with exams not containing pulmonary nodules, appropriate PET/CTs were selected with the criteria protocol name, time period (01/2017–12/2018), and the presence of the text string “no pulmonary nodules” in the clinically approved reports. This resulted in 92 PET/CTs (nodule negative population). The study workflow is displayed in Figure
Study workflow for (a) lung tumor population and (b) nodule negative population.
PET/CT examinations were performed on two integrated PET/CT systems: on a Discovery STE with 16-slice CT (GE Healthcare, Chalfont St Giles, UK) from 01/2008 to 11/2015 and on a Biograph mCT-X RT Pro Edition with 128-slice CT (Siemens Healthineers, Erlangen, Germany) from 12/2015 to 12/2016. Scans were obtained 1 hour after intravenous injection of 5 MBq FDG/kg body weight at glycemic levels below 10 mmol/L and previous fasting for at least 6 h. The CT component of the combined PET/CT examination was acquired with the following parameters: Discovery STE: slice thickness 3 mm, i50f kernel, X-ray tube voltage 120 kVp (SD: 0 kVp), exposure 80 mAs (SD: 15 mAs), CTDIvol 5.8 mGy (SD: 1.7 mGy), and DLP 536 mGy
Manual tumor segmentations with reference to the clinically approved report were performed as previously described [
The transversal 3 mm low-dose CT series of the PET/CTs with histologically proven primary lung tumor (
The output of the AI algorithm pipeline was the transversal chest CT component of the PET/CT with overlays for lung lobe boundaries and tumor boundaries of detected tumors. This output series also contained specifications of volume (VolumeAI), 2D diameter, and location (lung lobe) for every detected tumor and served as the index test. The reference standard was the CT component of the PET/CT for detection and the volumes that were calculated from the 3D tumor masks that resulted from the manual image segmentation process (ground truth volumes: VolumeGT). For each case, the segmented tumor was visually correlated with the output series of the algorithm and it was recorded whether the tumor was detected or not. The correctness of the indication of tumor location (lung lobe) was checked. We additionally established whether a lesion contacted parietal pleura or not by consensus reading (A. S. and T. W.). Finally, we reviewed the output series of the nodule negative population to describe numbers of and reasons for false-positive findings.
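The step from a 3D tumor mask to a ground truth volume can be illustrated with a minimal sketch. It assumes the mask is a binary voxel grid and that the voxel spacing is known; the function name `mask_volume_cm3`, the toy mask, and the in-plane spacing of 1 mm are hypothetical and only serve the illustration (the 3 mm slice thickness matches the CT series described here). It is not the implementation used in the study.

```python
def mask_volume_cm3(mask, spacing_mm):
    """Volume of a binary 3D mask given as nested lists [z][y][x], in cm^3.

    spacing_mm is (slice thickness, row spacing, column spacing) in mm.
    """
    voxel_count = sum(1 for plane in mask for row in plane for v in row if v)
    dz, dy, dx = spacing_mm
    voxel_volume_mm3 = dz * dy * dx
    # 1 cm^3 = 1000 mm^3
    return voxel_count * voxel_volume_mm3 / 1000.0


# Hypothetical toy mask: 2 slices of 3x3 voxels, 5 voxels marked as tumor.
toy_mask = [
    [[0, 1, 0], [1, 1, 0], [0, 0, 0]],
    [[0, 1, 0], [0, 1, 0], [0, 0, 0]],
]

# Assumed spacing: 3 mm slices, 1 mm x 1 mm in-plane resolution.
toy_volume = mask_volume_cm3(toy_mask, spacing_mm=(3.0, 1.0, 1.0))
```

With real data, the spacing would be read from the image header rather than hard-coded.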
Statistical analysis was performed using IBM SPSS Statistics for Windows, Version 22.0 (IBM Corp., Armonk, NY). Scatterplots and graphs were created with JMP, Version 14.2 (SAS Institute Inc., Cary, NC). For descriptive analyses of continuous data, we calculated the mean and standard deviations. To test for association between two or more categorical variables, we used the chi-squared test. To test for statistical differences among the means of two or more groups, we conducted a one-way analysis of variance. Normal distribution was assessed with the Shapiro–Wilk test, histograms, and Q-Q plots. To analyze the influence of histology, location, pleural contact, and maximal axial diameter on detection rates, we performed a binomial logistic regression with detection (yes/no) as the dependent variable. In this model, the largest histology subgroup and the most common location regarding the lung lobe (for location) were set as reference categories of the categorical variables. For the analysis of segmentation performance, all tumors with automatically calculated tumor volumes (VolumeAI) were considered (=all tumors detected). We used the Pearson correlation coefficient to assess the relationship between VolumeGT and VolumeAI.
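For readers who want to reproduce the volume comparison outside SPSS, the Pearson correlation coefficient between VolumeGT and VolumeAI can be computed as in the following sketch. The function `pearson_r` and the four volume pairs are hypothetical illustrations, not study data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical volumes in cm^3 (not taken from the study).
volume_gt = [1.0, 2.0, 4.0, 8.0]
volume_ai = [1.1, 1.9, 4.2, 7.6]

r = pearson_r(volume_gt, volume_ai)
```

A value of r close to 1 indicates a strong linear relationship, but it does not by itself show that the two volumes agree in absolute terms.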
To elucidate the influence of textural features on detection rates, we extracted 200 radiomic features with Pyradiomics version 2.1.0 [
The mean patient age was 66.7 years (SD: 10.7 years). 70.3% of the patients were male (
Distribution of the lung tumor histology subtypes.
Tumor histology | n | % |
---|---|---|
Adenocarcinoma (AC) | 174 | 54.2 |
Squamous cell carcinoma (SCC) | 79 | 24.6 |
NSCLC not specified (NOS) | 25 | 7.8 |
SCLC | 15 | 4.7 |
Other | 28 | 8.7 |
The attribution of a lesion to the corresponding lung lobe was correct in 100% of the detected lesions. Detection rates differed significantly across T-categories and declined towards advanced tumors: 90.4% for T1 (75 of 83), 70.8% for T2 (75 of 106), 29.4% for T3 (15 of 51), and 8.8% for T4 (7 of 80). This detection decline is also reflected in Figure
Tumors and their detection status. Tumors detected by the algorithm are visualized in dark blue and missed tumors in light blue. (a) Histogram per T-category. (b) Detection of tumors depending on the ground truth volumes. Every dot represents one tumor.
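The detection rates per T-category follow directly from the reported counts, and the association between T-category and detection can be checked with a chi-squared test of independence. The sketch below uses the counts stated in the text; the function names are hypothetical, and the code only computes the test statistic, not the p-value reported by the statistics package.

```python
# Detected tumors and totals per T-category, as reported in the text.
detected = {"T1": 75, "T2": 75, "T3": 15, "T4": 7}
totals = {"T1": 83, "T2": 106, "T3": 51, "T4": 80}


def detection_rates(detected, totals):
    """Detection rate in percent for every category."""
    return {cat: 100 * detected[cat] / totals[cat] for cat in totals}


def chi_squared_statistic(detected, totals):
    """Chi-squared statistic for the categories x (detected, missed) table."""
    n = sum(totals.values())
    n_detected = sum(detected.values())
    n_missed = n - n_detected
    chi2 = 0.0
    for cat, total in totals.items():
        observed = (detected[cat], total - detected[cat])
        # Expected counts under independence of category and detection.
        expected = (total * n_detected / n, total * n_missed / n)
        chi2 += sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2


rates = detection_rates(detected, totals)
chi2 = chi_squared_statistic(detected, totals)
dof = len(totals) - 1  # (4 categories - 1) * (2 outcomes - 1)
```

The statistic is compared against the chi-squared distribution with 3 degrees of freedom, whose critical value at the 5% level is 7.815.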
Binomial logistic regression conducted to explore factors that influence detection rates showed that tumors with a larger maximal axial diameter and tumors with pleural contact were more likely to be missed by the detection algorithm (both
Results of the binomial logistic regression.
Independent variables | | Exp( |
---|---|---|
Histology subtype | | |
Reference: adenocarcinoma | | |
(1) Squamous cell carcinoma | | |
(2) NSCLC (NOS) | 0.181 | 0.443 (0.134–1.461) |
(3) SCLC | | |
(4) Others | 0.653 | 0.765 (0.237–2.464) |
Location (lobes) | | |
Reference: right upper lobe | | |
(1) Middle lobe | 0.350 | 0.499 (0.116–2.145) |
(2) Right lower lobe | 0.495 | 1.446 (0.502–4.167) |
(3) Left upper lobe | 0.905 | 1.054 (0.448–2.480) |
(4) Left lower lobe | 0.902 | 0.943 (0.369–2.408) |
Pleural contact | | |
Maximal axial diameter | | |
Detection (yes/no) was set as the dependent variable. Independent variables: histology (categorical), location (categorical), pleural contact (dichotomous), and maximal axial diameter (continuous). Exp(
Table
Results of the radiomic analysis with features from Pyradiomics.
Selected feature | Lasso coefficient | Youden cutoff |
---|---|---|
CT_glrlm_GrayLevelNonUniformityN | −1.0776312 | 0.1166608 |
PET_firstorder_10Percentile | −0.0344698 | 1.7492108 |
PET_firstorder_Maximum | −0.0022762 | 6.9905767 |
PET_gldm_DependenceEntropy | 0.0716689 | 2.2174546 |
shape_Maximum2DdiameterSlice | −0.0043233 | 32.866422 |
shape_Sphericity | 0.2268932 | 0.4293948 |
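The "Youden cutoff" column can be read as the feature threshold that maximizes Youden's J statistic (sensitivity + specificity − 1). The following sketch shows how such a cutoff is commonly derived for a single feature. It assumes that higher feature values predict detection; the function `youden_cutoff` and the sphericity values are hypothetical and do not reproduce the study's analysis.

```python
def youden_cutoff(values, labels):
    """Return (cutoff, J) maximizing sensitivity + specificity - 1.

    A case is predicted positive when its value is >= cutoff.
    labels: 1 = detected (positive), 0 = missed (negative).
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(values)):
        true_pos = sum(1 for v, l in zip(values, labels) if l and v >= cutoff)
        true_neg = sum(1 for v, l in zip(values, labels) if not l and v < cutoff)
        j = true_pos / positives + true_neg / negatives - 1
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Hypothetical sphericity values and detection labels (not study data).
sphericity = [0.30, 0.35, 0.40, 0.42, 0.45, 0.50, 0.60]
was_detected = [0, 0, 1, 0, 1, 1, 1]

cutoff, j_stat = youden_cutoff(sphericity, was_detected)
```

For features with a negative coefficient, the comparison would be reversed so that lower values predict detection.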
All tumors detected by the algorithm were included in the second step of our analysis that investigated the segmentation performance (all:
Segmented ground truth volumes (VolumeGT) in cm3 (
Examples for (a) manual segmentation of a T1 tumor without pleural contact with (b) corresponding excellent segmentation by the algorithm, (c) an incompletely segmented T3 lesion with pleural attachment, and (d) a completely missed T4 lesion with infiltration of the chest wall.
Mean age of the patients was 63.2 years (SD: 16.6 years). There were 60.6% males (
The evaluated AI-driven algorithm allows for excellent detection and segmentation of pulmonary T1 lesions (detection rate: 90.4%; excellent correlation of VolumeAI and VolumeGT:
The first step of CAD systems is to detect the location of lesions in medical images [
Our radiomics analysis revealed further features that influence the detection rates: tumors with a finer, less heterogeneous texture and a rounder shape were detected more often. While the utility of texture analysis for the differentiation of benign vs. malignant lung lesions [
After detection, segmentation of lung lesions is the subsequent step that, if done correctly, paves the way to a plethora of secondary analyses that are currently being developed within the context of AI, radiomics, and personalized medicine. In this context, Owens et al. compared contours of 10 lung tumors ranging from 1.1 cm3 to 10.5 cm3 defined by human readers in consensus, corresponding to our categories T1 and T2, with two semiautomatic segmentation methods: Lesion Sizing Toolkit (LSTK) and GrowCut [
According to current guidelines, FDG-PET/CT is considered the standard imaging procedure of choice for noninvasive staging of lung cancer [
Our work has several limitations. First, manual segmentation was performed by two readers in random order without consensus or double reading. Both consensus and double reading are time-consuming and were therefore not practicable in this study with a total of 320 lesions. Second, the assessment of segmentation quality was based on a comparison of the automatically calculated tumor volumes with the ground truth volumes. More advanced measures such as Dice similarity coefficients or Hausdorff distances could not be applied because spatial coordinates were not accessible in the manually created tumor masks. Third, for the creation of manual tumor masks, the FDG-PET component was considered whenever tumor borders could not be well delineated on the CT component, while automated tumor detection was performed only on the CT component. Including the information contained in the PET component could possibly increase detection rates and segmentation quality. Fourth, the analysis was conducted in two steps: detection and segmentation. Because detection rates were lower for more advanced tumors, a selection bias in step two of the analysis could have positively influenced segmentation performance in this group.
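For orientation, the Dice similarity coefficient mentioned among the limitations measures the voxel-wise overlap of two masks: twice the number of shared voxels divided by the sum of the voxel counts of both masks. The sketch below assumes that both masks are available as binary voxel grids on the same coordinate grid, which was not the case in this study; the function name and the toy masks are hypothetical.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice similarity coefficient of two binary 3D masks [z][y][x]."""

    def voxel_set(mask):
        # Coordinates of all voxels marked as foreground.
        return {
            (z, y, x)
            for z, plane in enumerate(mask)
            for y, row in enumerate(plane)
            for x, v in enumerate(row)
            if v
        }

    a, b = voxel_set(mask_a), voxel_set(mask_b)
    if not a and not b:
        return 1.0  # two empty masks are treated as identical
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical single-slice masks that overlap in one column.
manual_mask = [[[1, 1, 0], [1, 1, 0]]]
auto_mask = [[[0, 1, 1], [0, 1, 1]]]

dice = dice_coefficient(manual_mask, auto_mask)
```

Unlike a pure volume comparison, this measure penalizes a segmentation that has the right volume but lies in the wrong place.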
In conclusion, the tested algorithm facilitates fast and reliable detection and 3D segmentation of pulmonary T1 and T2 tumors and also works well on the CT component of PET/CTs acquired in free breathing and with a slice thickness of 3 mm. The detection and segmentation of more advanced lung tumors is currently imprecise because the algorithm was designed for lung nodules. Consequently, there is still an unmet need for CAD applications that also cope with the more complex segmentation tasks required in the context of lung cancer staging. Future efforts must therefore focus on these more advanced tumors to facilitate segmentation of all tumor types and sizes and to bridge the gap between CAD applications for screening and staging of lung cancer.
The volumetric data are all published within this manuscript. A large part of the data are patient data and thus confidential. Upon request, a minimal anonymized dataset will be available to interested researchers.
The authors declare that they have no conflicts of interest.
We want to thank Victor Parmar for proofreading the article. The manual segmentation masks were acquired during the project “LungStage—Computer Aided Staging of Non-Small Cell Lung Cancer (NSCLC),” funded by CTI (Commission for Technology and Innovation) (Project no. 25280.1).