Association of the Differences in Average Glandular Dose with Breast Cancer Risk

Objectives. To compare the differences in normalized average glandular dose (NAGD) between the breasts of healthy subjects and those of cancer patients and to determine whether the NAGD difference is associated with breast cancer risk and improves breast cancer classification. Materials and Methods. Craniocaudal view and mediolateral oblique view full-field digital mammography (FFDM) images were obtained from 1682 healthy subjects whose breasts were categorized as Breast Imaging-Reporting and Data System (BI-RADS) I or II and from 811 biopsy-confirmed unilateral breast cancer patients whose breasts on the contralateral side were category I or II. Both populations were randomized into training and test sets. Multivariate logistic regression analysis was used to build the breast cancer risk assessment model, and the area under the receiver operating characteristic curve (A_z) was used to evaluate the model. Twenty-two breast cancer patients who were originally categorized as BI-RADS I or II for both breasts but were subsequently diagnosed with biopsy-confirmed unilateral breast cancer were included to validate the model. Results. The NAGD differences in both FFDM views between the tumor-bearing breasts and the healthy breasts of patients were significantly higher than those in healthy subjects (P < 0.001). The model with NAGD differences had a higher A_z value than the model without NAGD differences. While there were no NAGD differences between the originally healthy breasts of breast cancer patients, significant NAGD differences between the now tumor-bearing breasts and the previously healthy breasts were found in both FFDM views. Conclusions. NAGD differences between both breasts can be included in the breast cancer risk assessment model to evaluate breast cancer risk.


Introduction
Breast cancer is the leading cause of cancer death among women globally [1,2]. In China, breast cancer has in recent years been the predominant contributor to overall cancer incidence in women [3]. Full-field digital mammography (FFDM) is the most effective method for breast cancer screening [4-6]. With the advancement of this technology, early cancer diagnosis has improved considerably because of its ability to detect small lesions [7,8].
Previous studies have shown that identifying women at high risk of developing breast cancer can improve their survival rate and reduce the mortality rate of breast cancer [9,10]. A breast cancer risk assessment model was first proposed by Gail et al. [11,12] in 1989 to identify women with a higher risk for breast cancer for further screening and possible preventive therapies. A logistic regression model was later developed and validated to predict the probability of developing breast cancer for a woman based on the odds ratio (OR) of classical risk factors [13]. A detailed family history was also incorporated as an independent factor for a breast cancer risk assessment model [14]. However, because these models are solely based on demographic information, they have low positive predictive values; thus, a breast cancer risk assessment model that includes more predictive cancer risk factors is clinically needed.
Image analysis of FFDM images has been widely used to identify women at high risk for breast cancer. A computerized mammographic parenchymal pattern measurement was developed to evaluate the texture feature difference between healthy subjects and cancer patients [10]. With the extensive application of deep learning in imaging, convolutional neural networks that extract parenchymal features from FFDM images have performed much better than conventional texture analysis in distinguishing cancer risk populations [15]. Furthermore, inclusion of radiomic texture features provides a better classification for differentiating between malignant and benign lesions [16]. Quantitative parenchymal patterns are associated with breast cancer risk and have a higher image-based discriminatory capability than the classical risk factors used in existing prediction models [17]. This finding is consistent with previous studies showing that the pattern of breast parenchyma changes during breast cancer development [10,15,16,18].
The average glandular dose (AGD) increases with breast thickness but decreases with breast density [19]. Several studies have taken breast thickness and breast density into consideration in FFDM images when AGD is measured [19-21]. Previously proposed breast cancer risk models incorporated various legitimate risk factors. However, to the best of our knowledge, the normalized average glandular dose (NAGD) difference between the breasts of healthy subjects and breast cancer patients has not been investigated.
Because breast cancer is a progressive disease, we hypothesized that the changes in breast parenchyma might lead to an increase in NAGD. Therefore, this study compared the NAGD differences between the breasts of healthy subjects and breast cancer patients, evaluated their association with breast cancer risk, and assessed whether the addition of the NAGD difference as a risk factor could improve cancer classification.

Study Population.
This retrospective study was approved by the institutional review board. The requirement to obtain written informed consent was waived. Women who underwent FFDM screening examinations at Nanfang Hospital, Southern Medical University, from 2014 through 2017 were recruited for participation in our study. Information on classical breast cancer risk factors, including age, age at menarche, menopausal status, parity status, age at first birth, and first-degree family history of breast cancer, was routinely collected by means of a self-administered questionnaire for every subject at the time of the FFDM screening examination. Breast density was determined from the radiologists' screening report.
The craniocaudal (CC) and mediolateral oblique (MLO) view images (Hologic Selenia Dimensions, Marlborough, MA, USA) of the breasts of patients who had undergone FFDM screening examinations and biopsy at the hospital were sequentially acquired. Eligible patients had biopsy-confirmed cancer in one breast, and the contralateral breast was categorized as I (negative) or II (definitely benign) based on the Breast Imaging-Reporting and Data System (BI-RADS). CC-view and MLO-view FFDM images of healthy subjects were sequentially collected at the hospital. Eligible subjects included women of any age who had normal screening mammograms with BI-RADS category I or II for both breasts.
The exposure settings were selected by the automatic exposure control (AEC) system. Two experienced radiologists (more than 15 years in breast imaging diagnosis) verified the final BI-RADS category for every subject. Women who had prior breast cancer or breast implants, or who had missing demographic information, were excluded.
We collected two datasets for development and evaluation of the breast cancer risk model. The first dataset consisted of 1869 subjects (1261 healthy subjects and 608 cancer patients) and was used for risk model development.

BioMed Research International
The second set consisting of 624 subjects (421 healthy subjects and 203 cancer patients) was used for risk model validation.
Twenty-two independent breast cancer patients, who were originally categorized as BI-RADS I or II for both breasts during FFDM screening examinations but were diagnosed with unilateral biopsy-proven breast cancer at least 1 year later, were included for model validation. The validation controls included 187 healthy subjects who had cancer-free follow-up for at least 3 years.

Data Analysis.
In all subjects, the AGD of each CC-view and MLO-view FFDM image was extracted from the DICOM headers using MATLAB 2014a. As AGD was associated with breast thickness, we normalized the AGD by breast thickness to obtain the NAGD. Because of left-right breast symmetry, the NAGD difference between the breasts of a healthy subject was calculated by subtracting the value of the left side from that of the right side. In contrast, the NAGD difference between the breasts of a cancer patient was calculated by subtracting the value of the normal side from that of the cancer side.
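The calculation described above can be sketched as follows. This is a minimal illustration only: the text states that AGD was normalized by thickness but does not give the formula, so the ratio AGD / thickness, the units, and all names (`BreastView`, `nagd`, `nagd_difference`) and numbers are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class BreastView:
    """AGD and compressed thickness taken from one FFDM image header."""
    agd_mgy: float       # average glandular dose (assumed unit: mGy)
    thickness_mm: float  # compressed breast thickness (assumed unit: mm)


def nagd(view: BreastView) -> float:
    """Thickness-normalized AGD; the ratio form is an assumption."""
    if view.thickness_mm <= 0:
        raise ValueError("thickness must be positive")
    return view.agd_mgy / view.thickness_mm


def nagd_difference(minuend: BreastView, subtrahend: BreastView) -> float:
    """Return NAGD(minuend) - NAGD(subtrahend) for one view (CC or MLO).

    Healthy subject: minuend = right breast, subtrahend = left breast.
    Cancer patient:  minuend = cancer side,  subtrahend = normal side.
    """
    return nagd(minuend) - nagd(subtrahend)


# Hypothetical values for one view of a healthy subject.
right = BreastView(agd_mgy=1.50, thickness_mm=50.0)
left = BreastView(agd_mgy=1.20, thickness_mm=48.0)
diff = nagd_difference(right, left)
```

The same function serves both groups; only the choice of which breast is the minuend changes, which mirrors the convention stated in the text.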

Statistical Analysis.
Multivariate logistic regression analysis was used to determine the association between the variables and breast cancer risk. The OR was used to assess the association between breast cancer risk and the NAGD difference. Risk factors such as age, age at menarche, menopausal status, parity status, breast density, and first-degree family history of breast cancer were included in the association analysis. Variables that were initially found to significantly improve prediction were sequentially used to build a model for breast cancer classification. The classification performance of the model was measured using the area under the receiver operating characteristic (ROC) curve (A_z). The Kolmogorov-Smirnov test was used to assess the normality of all NAGD differences. The unpaired Student t-test was used to compare the NAGD differences between healthy subjects and cancer patients. All statistical analyses were performed using SPSS 19 (SPSS Inc., Chicago, IL, USA), and the level of significance was set at 0.05.
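To make the link between a logistic regression coefficient and the reported OR concrete, the sketch below converts a fitted coefficient into an odds ratio with an approximate 95% Wald interval and evaluates the logistic model for one subject. The function names and any input values are hypothetical; the study's own coefficients came from SPSS and are not reproduced here.

```python
import math


def odds_ratio(beta: float, se: float, z: float = 1.96):
    """Return (OR, lower, upper) for one logistic regression coefficient.

    beta is the fitted log-odds coefficient and se its standard error;
    z = 1.96 gives an approximate 95% Wald confidence interval.
    """
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))


def predicted_probability(intercept: float, betas, features) -> float:
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear = intercept + sum(b * x for b, x in zip(betas, features))
    return 1.0 / (1.0 + math.exp(-linear))
```

An OR above 1 for the NAGD difference would mean that the odds of cancer rise as the difference grows, with the other factors in the model held fixed.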

Results
To investigate whether NAGD differences between the breasts of healthy subjects and of patients with breast cancer were associated with breast cancer risk, we built a breast cancer risk assessment model on a training set of 1869 subjects (1261 healthy subjects and 608 cancer patients) and validated it on a test set of 624 subjects (421 healthy subjects and 203 cancer patients). The characteristics of the study population are shown in Table 1. For the training set, the mean age of the cancer patients was 49.1 ± 10.5 years, which was significantly higher than the 46.4 ± 7.4 years of healthy subjects. For the test set, the mean ages of cancer patients and healthy subjects were 50.8 ± 10.6 years and 48.4 ± 7.0 years, respectively. The BI-RADS density of healthy subjects and cancer patients was mainly in the heterogeneous category. Menopausal status, parity, and first-degree family history of breast cancer were treated as binary variables in the analysis. The Kolmogorov-Smirnov test showed that all the variables were consistent with a Gaussian distribution (P > 0.05) in both the training and test sets. Figure 1 shows the NAGD values of the CC-view and MLO-view FFDM images from healthy subjects (Figure 1(a)) and cancer patients (Figure 1(b)). The NAGD differences between both breasts of cancer patients were significantly higher than those of healthy subjects (Figure 1(c)). The correlations of the NAGD differences in CC-view and MLO-view FFDM images with potential confounding factors are summarized in Tables 2 and 3. Age and menopausal status were weakly associated with NAGD differences in both CC-view and MLO-view FFDM images, but these associations were statistically significant.
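The group comparison of NAGD differences relies on the unpaired Student t-test named in the Statistical Analysis section. A minimal sketch of the pooled-variance t statistic is given below; the function name and inputs are hypothetical, and the P value (which SPSS reports from the t distribution with the returned degrees of freedom) is deliberately not computed here.

```python
import math
import statistics


def student_t(group_a, group_b):
    """Pooled-variance (equal variance) unpaired t statistic.

    Returns (t, degrees_of_freedom). Each group needs at least two
    values so that a sample variance can be computed.
    """
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1)
    var_b = statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    t = (mean_a - mean_b) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df
```

Called as `student_t(patient_differences, healthy_differences)`, a large positive t would correspond to higher NAGD differences in the patient group.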
The areas under the ROC curve (A_z) for age, menopausal status, and the NAGD differences in the CC-view and MLO-view FFDM images, calculated to assess their effectiveness in differentiating cancer patients from healthy subjects, were 0.58 ± 0.01, 0.58 ± 0.01, 0.70 ± 0.01, and 0.72 ± 0.01, respectively (Figure 2). The difference in A_z values between age and menopausal status was not significant (P > 0.05). However, the A_z values for the NAGD differences in CC-view and MLO-view FFDM images were significantly higher than those for age and menopausal status (P < 0.05).
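For readers less familiar with the area under the ROC curve, it can be read as the probability that a randomly chosen cancer patient receives a higher score than a randomly chosen healthy subject. The sketch below computes this empirical (rank-based) estimate; it is an illustration under that assumption and not necessarily the estimator used to obtain the A_z values reported here. The function name and scores are hypothetical.

```python
def empirical_auc(scores_patients, scores_healthy) -> float:
    """Fraction of (patient, healthy) pairs ranked correctly.

    A pair counts 1 if the patient's score is higher, 0.5 if the two
    scores tie, and 0 otherwise. O(n * m), fine for small samples.
    """
    if not scores_patients or not scores_healthy:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for p in scores_patients:
        for h in scores_healthy:
            if p > h:
                wins += 1.0
            elif p == h:
                wins += 0.5
    return wins / (len(scores_patients) * len(scores_healthy))
```

A value of 0.5 corresponds to chance-level separation and 1.0 to perfect separation, which gives a scale for reading the values of 0.58 to 0.72 above.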
Although the OR for the NAGD differences in CC-view and MLO-view FFDM images was reduced after adjusting for the effects of other variables, the association between NAGD differences and breast cancer risk persisted in the multivariate analysis (Table 4). The NAGD differences in CC-view (P < 0.001) and MLO-view (P < 0.001) FFDM images, age (P = 0.014), menopausal status (P = 0.002), age at first birth (P = 0.005), and parity (P = 0.001) were all factors that significantly contributed to the prediction of breast cancer risk. Nonsignificant risk factors, including age at menarche, family history, and breast density, were also included in the model as they improved its accuracy. The logistic regression model with all nine factors listed in Table 4 had an A_z value of 0.77 ± 0.01 for the training set and 0.75 ± 0.02 for the test set (Figures 3(a) and 3(b)). The model without NAGD differences had A_z values of 0.61 ± 0.01 and 0.56 ± 0.02 for the training set and the test set, respectively. The breast cancer risk assessment model with NAGD differences as an additional factor was significantly improved compared with the model without NAGD differences (P < 0.001).
Figure 4 shows the NAGD differences in CC-view and MLO-view FFDM images between the breasts of healthy subjects and of cancer patients at the initial healthy state and at the later follow-up state, when one breast of each cancer patient had a tumor. As expected, no significant NAGD differences were found in the follow-up CC-view (P = 0.34) and MLO-view (P = 0.97) FFDM images of healthy subjects. However, the NAGD differences in both CC-view (P = 0.035) and MLO-view (P = 0.043) FFDM images between both breasts of the cancer patients were significantly higher than those between their previously healthy breasts.
Table notes: * The univariate analysis was used to examine the association between the NAGD differences and breast cancer risk. † The multivariate analysis was used to examine the independent contribution to breast cancer risk.

Discussion
In this study, we found significantly higher NAGD differences in CC-view and MLO-view FFDM images between the tumor-bearing and healthy breasts of patients than between the breasts of healthy subjects. The logistic regression model with NAGD differences had a higher A_z value than the model without NAGD differences. Our results showed that a breast cancer risk assessment model including NAGD differences between both breasts could be clinically beneficial for evaluating breast cancer risk. Breast cancer is the most common malignant cancer in women, so predicting the likelihood that an individual woman will develop breast cancer in the future is an important strategy to reduce cancer death [22,23]. Previous studies have found a high correlation between parenchymal texture features extracted from the left and right breasts due to left-right breast symmetry [24,25]. Zheng et al. [26] used computed bilateral mammographic density asymmetry in sequential screening examinations to classify subjects at low and high risk of developing breast cancer. In our study, the finding that the NAGD difference between both breasts of cancer patients was significantly higher than that of healthy subjects in both CC-view and MLO-view FFDM images suggested that parenchymal changes occur on the cancer side after tumors develop and that these changes in breast parenchyma might lead to an increase in NAGD.
Breast density, age, and menopausal status are strong risk factors associated with breast cancer in previous risk assessment studies [22,27]. However, we found that the NAGD differences did not show a strong correlation with breast density (Tables 2 and 3). In addition, the correlation between NAGD differences and other classical risk factors was also low. These findings suggested that the NAGD difference between both breasts could serve as an independent risk factor that is only weakly associated with the established breast cancer risk factors. In this study, we used the ROC method to compare the ability of the NAGD differences in CC-view and MLO-view FFDM images to differentiate healthy subjects from cancer patients with that of patients' age and menopausal status (Figure 2). The results showed that the NAGD differences had a significantly higher accuracy for cancer classification than age (P < 0.05) and menopausal status (P < 0.05).
In this study, we built a logistic regression model that incorporated demographic information and NAGD differences in CC-view and MLO-view FFDM images in large healthy and cancer cohorts and also validated the model using independent healthy and cancer subjects. The analysis of follow-up healthy subjects and cancer patients indicated that the NAGD difference between both breasts was positively associated with breast cancer risk. When the NAGD differences between both breasts increased, the breast cancer risk increased before the lesion became visible on FFDM images.
It is worth noting that this study had some limitations: the follow-up sample size was small; the robustness of the model has yet to be demonstrated with other independent test datasets; and the association of the NAGD differences with other risk factors remains to be investigated. In conclusion, our study demonstrates that the NAGD difference between both breasts is a promising risk factor for differentiating cancer patients from healthy subjects. Furthermore, NAGD difference is strongly associated with breast cancer risk and weakly with other known classical risk factors, suggesting that it has the potential to serve as an independent factor in a breast cancer risk assessment model.

Conclusions
We conclude that NAGD differences between both breasts show promise as an independent risk factor and can be included in the breast cancer risk assessment model. When NAGD differences between both breasts increase, the breast cancer risk increases before the lesion is readily perceptible to the human eye on FFDM images. For future work, a multicenter study is needed to improve the prediction accuracy and robustness of the model.

Data Availability
The datasets generated and analyzed during the present study are available from the corresponding author on reasonable request.