Reliable Breast Cancer Diagnosis with Deep Learning: DCGAN-Driven Mammogram Synthesis and Validity Assessment

Breast cancer imaging is paramount to quickly detecting and accurately evaluating the disease. The scarcity of annotated mammogram data presents a significant obstacle when building deep learning models that can produce reliable outcomes. This paper proposes a novel approach that utilizes deep convolutional generative adversarial networks (DCGANs) to effectively tackle the issue of limited data availability. The main goal is to produce synthetic mammograms that accurately reproduce the intrinsic patterns observed in real data, enhancing the current dataset. The proposed synthesis method is supported by thorough experimentation, demonstrating its ability to accurately reproduce diverse viewpoints of the breast. A mean similarity assessment with a standard deviation was performed to evaluate the credibility of the synthesized images and establish the clinical significance of the data obtained. A thorough evaluation of the uniformity within each class was conducted, and any deviations from each class's mean values were measured. Outlier removal using a specified threshold is a crucial element of the process. This procedure improves the accuracy of each image cluster and strengthens the synthetic dataset's overall dependability. The visualization of the class clustering results highlights the alignment between the produced images and the inherent distribution of the data. After removing outliers, distinct and consistent clusters of homogeneous data points were observed. The proposed similarity assessment demonstrates noteworthy effectiveness, eliminating redundant and dissimilar images from all classes. Specifically, out of 600 synthetic mammograms per class, 505 instances remain in the normal class, 495 in the benign class, and 490 in the malignant class. To further check the validity of the proposed model, human experts visually inspected and validated the synthetic images.
This highlights the effectiveness of our methodology in identifying substantial outliers.


Introduction
Breast cancer (BC) is a common type of cancer in females that forms in the cells of the breast. It is triggered by the abnormal growth of cells in the breast that divide uncontrollably, forming a tumor. Its early detection is crucial, as it allows for timely treatment, thus saving many lives [1]. In 2020, a total of 2.26 million new instances of breast cancer were identified globally, accounting for approximately 11.7% of the overall incidence of cancer. This form of cancer has now become the most prevalent, surpassing lung cancer, which has a prevalence rate of 11.4%. Certain regions, namely, Australia, Europe (North, West, and South), and North America, appear to be more significantly impacted than others [2]. Moreover, according to the World Health Organization [3], the incidence of breast cancer is projected to rise from 2.26 million new cases in 2020 to 3.19 million in 2040, reflecting a notable rising trend of 41% over the following two decades. Cancer is also becoming more prevalent in Pakistan; worldwide, 19 million new cancer cases of all types were recorded in 2020. In Pakistan, the annual incidence of breast cancer is predicted to exceed 83,000 cases, and over 40,000 women succumb to this debilitating ailment annually [4]. Medical practitioners use several modalities for early prediction of breast cancer, such as mammography, ultrasound, and magnetic resonance imaging (MRI). Mammography is a low-dose X-ray of the breast and is the most common screening tool for breast cancer [5]. However, accurately interpreting mammograms is challenging, as these images are complex and detecting abnormalities can be difficult. Human reading of mammograms can sometimes produce false-positive or false-negative results. Variability in expertise can also lead to inconsistencies in diagnosis and treatment decisions [6].
The artificial intelligence community is working to help with breast cancer detection. Extensive research is conducted by training algorithms on large mammogram and clinical datasets. These models can recognize breast cancer patterns and accurately analyze mammograms, reducing false positives and negatives and helping radiologists make better diagnosis and treatment decisions. Machine learning (ML) models can standardize analysis and reduce the dependence on individual radiologists' expertise [7, 8]. Machine learning algorithms can identify subtle patterns and features that may not be readily noticeable to human observers and can aid in the early detection of BC.
ML can identify breast cancer, but it has limits. Training data are crucial to such models. A model's performance may be limited if the training data do not include unusual or atypical breast cancer cases. Comprehensive and representative datasets are essential to detecting breast cancer across populations and variations. These models may fail to generalize to new data that differ significantly from the training data [9-11].
Researchers [12] increasingly acknowledge the significance of tackling the issue of limited data representation to enhance the performance and generalizability of these models. The technique of inflating existing datasets is widely employed. Data augmentation is a methodology that encompasses the generation of novel training samples by applying diverse transformations to the preexisting data. The augmentation of mammogram images can encompass various techniques, such as flipping, rotation, scaling, or introducing noise. These techniques serve the purpose of enhancing the diversity and variability of the training data, enabling the model to acquire knowledge from a more extensive array of instances. However, this approach also has limitations.
In conjunction with continuous research and development endeavors, these methodologies strive to tackle the issue of inadequate data representation in machine learning and enhance the efficacy and applicability of models across diverse fields [13-16].
The primary objective of this study is to employ a DCGAN for data augmentation and subsequently validate its effectiveness. DCGANs can create synthetic mammograms from minimal real data, as shown in Figure 1.
The primary motivation is to overcome data shortages, enhance diagnostic precision, and advance breast cancer imaging using generative models to synthesize realistic mammograms. The generated data could improve deep learning models and aid in providing patients with better care. The study aspires to improve patients' lives worldwide by improving such models' diagnostic accuracy through more robust data augmentation. The accuracy of the proposed models helps clinicians detect breast cancer earlier.
The study's objectives include utilizing DCGANs to create synthetic mammograms to fill the data gap and increase the dataset's diversity and clinical applicability. Exposing deep learning models to a more extensive and varied dataset during training can improve their precision in detecting breast cancer. Increasing the number of cases available for comparison and validation through synthetic images will boost radiologists' confidence in their diagnostic abilities.
Section 2 presents recent literature on GANs. Section 3 presents the proposed methodology for mammogram augmentation and subsequent validation. The results of the study are presented in Section 4. The conclusion and future work are discussed in Section 5.

Related Work
In the context of mammography, several data augmentation approaches are frequently employed to improve the performance and generalizability of the models. Machine learning has experienced substantial progress in the last decade, primarily due to advancements in deep neural networks. These networks have demonstrated exceptional performance in several medical imaging tasks, contributing to the increased popularity of machine learning in this domain. Meanwhile, the field of generative modelling and data synthesis has made significant advancements in quality, mainly attributed to the emergence of generative adversarial networks. GANs currently exhibit remarkable capabilities for generating visually realistic images that closely resemble the content of the datasets they were trained on.
Wu et al. [18] conducted research to address the data scarcity and imbalance problem in breast cancer detection using a publicly available dataset from the UK, namely the OPTIMAM Mammography Image Database. The dataset contains 8282 malignant, 1287 benign, and 16887 normal mammographic images. They divided the data into 60% for training, 20% for validation, and 20% for testing. They trained a contextual GAN model with a self-attention mechanism to augment the dataset, using both traditional and GAN-based augmentations. Their GAN-augmented model produced an AUC of 0.846. The model performs a binary classification of normal and malignant.
Desai et al. [19] developed a DCGAN model using the benchmark DDSM dataset. They utilized 218 images for training and 47 each for testing and validation. Their experiments reported an accuracy of 78.23% when the model was trained on original images, while the combination of synthetic and authentic images produced an enhanced accuracy of 87%, an improvement of 8.77 percentage points. The authors show that GANs are a workable choice for training such models under data shortage.
Alyafi et al. developed a DCGAN model for breast mass augmentation using a subset of 80,000 images from the UK-based OPTIMAM mammography image database (OMI-DB) [17]. The authors demonstrate the performance of a classifier on an imbalanced dataset with and without synthetic data. They created breast mass patches with 128 × 128 pixel dimensions using a modified version of DCGAN. Another study generated breast mammograms with GANs [22]. Its main aim was to detect mammographically occult (MO) cancer in women with dense breasts. The researchers employed a convolutional neural network (CNN) trained on 1366 mammograms processed with the Radon cumulative distribution transform (RCDT), collected from the University of Pittsburgh Medical Center, USA. They reported an AUC of 0.77. The system can identify patients for further screening in the early detection of MO-related cancer. However, they did not consider benchmark datasets.
The authors of [23] developed a StyleGAN 2 system using 105,948 normal mammograms collected from Asan Medical Center, Korea, from January 2008 to December 2017. They evaluated the GAN-generated images through a Fréchet Inception Distance (FID) of 4.383 and an Inception Score of 16.67. The multiscale structural similarity index measure (MS-SSIM) stood at 0.39, and the average peak signal-to-noise ratio (PSNR) was 31.35. Their model performed with reasonable fidelity to real images, but the system was limited to normal local mammographic images. A summary of the literature is presented in Table 1.
This study presents an innovative approach to addressing the scarcity of annotated mammogram data by employing DCGANs. This methodology is adept at generating synthetic mammograms that mirror real-data characteristics with high fidelity. Key contributions of this research are outlined as follows: (i) The research extensively tests the effectiveness of the DCGAN-based synthesis in accurately replicating various mammographic features, including diverse tissue types, lesion characteristics, and breast views.
The quality and authenticity of these synthetic images are meticulously evaluated using mean similarity measures and standard deviation analyses, ensuring a rigorous assessment of their realism.
(ii) The study employs a systematic approach to enhance data precision by identifying and removing outliers. This is achieved through a threshold-based outlier removal mechanism, significantly bolstering the synthetic dataset's reliability. The refined dataset demonstrates clinical relevance, as evidenced by its consistency across different classes.

Methodology
This section describes the comprehensive methodology used to determine the reliability of the dataset and ensure the validity of the generated mammogram classes. The methodology includes creating mammogram classes with a DCGAN, determining similarity, and removing outliers using a three-fold standard deviation threshold. The overall methodology is depicted in Figure 2. This study aims to evaluate the quality of the generated classes methodically and improve the dataset's robustness. To feed data to the network, the mammograms were resized to the same size and format.
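The resizing and formatting step can be sketched as follows. This is a minimal NumPy illustration only; the paper does not specify the interpolation method or target resolution, so the nearest-neighbour resampling, min-max scaling, and the 64-pixel default are assumptions.

```python
import numpy as np

def preprocess(img: np.ndarray, size: int = 64) -> np.ndarray:
    """Resize a grayscale mammogram to size x size (nearest-neighbour)
    and scale its pixel intensities to [0, 1]."""
    h, w = img.shape
    rows = np.arange(size) * h // size   # source row index for each output row
    cols = np.arange(size) * w // size   # source column index for each output column
    resized = img[rows][:, cols].astype(np.float64)
    lo, hi = resized.min(), resized.max()
    # Guard against constant images to avoid division by zero.
    return (resized - lo) / (hi - lo) if hi > lo else np.zeros_like(resized)
```

In practice a library resizer with proper interpolation (e.g. bilinear) would be preferable; this sketch only shows the shape and intensity normalization contract expected by the network.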

DCGAN Architecture.
The architecture of the DCGAN [25] plays a pivotal role in generating realistic mammogram images. This subsection presents a comprehensive overview of its architecture tailored to the mammogram generation task, including detailed descriptions and tables depicting key components. Its general architecture is shown in Figure 3, where dotted arrows show fake mammograms. First, a noise batch z is generated; z is forwarded through the generator G; the real and fake batches are forwarded through the discriminator D; the discriminator loss LD is calculated and D is updated; then the generator loss LG is calculated and G is updated.
In the diagram, random latent vector samples are drawn from z ∼ Pz, where Pz = N(0, 1), for each training iteration (step 1). After the real batch is normalized to the range [−1, 1], the pure-noise batch is sent through G to create a set of fake images G(z) (step 2). As shown in step 3 with the dashed arrows, these fake images are normalized to the interval [0, 1] before passing through D to obtain realism probabilities. In step 4, LD is calculated, and the parameters of D are updated in step 5. After that, the fake batch is forwarded through D again and LG is calculated in step 6. Backpropagation is finally performed to update the parameters of G in step 7.
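Steps 1–3 of the loop can be illustrated with a short NumPy sketch. The latent dimension of 100 and the use of tanh as a stand-in for G's output layer are assumptions for illustration; the paper's tables define the actual layer shapes.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_latent(batch_size: int, dim: int = 100) -> np.ndarray:
    """Step 1: draw a latent batch z ~ Pz = N(0, 1)."""
    return rng.standard_normal((batch_size, dim))

def to_unit_interval(fake: np.ndarray) -> np.ndarray:
    """Step 3: map generator outputs from the tanh range [-1, 1]
    to [0, 1] before they are shown to the discriminator."""
    return (fake + 1.0) / 2.0

z = sample_latent(8)              # step 1: noise batch
fake = np.tanh(z)                 # step 2: stand-in for G(z)
d_input = to_unit_interval(fake)  # step 3: normalized fake batch for D
```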

Generator Network.
A random noise vector is fed into the generator network, which gradually converts it into synthetic mammogram images. It starts with convolutional layers and then adds nonlinearity with batch normalization and ReLU activation functions. Skip connections, inspired by U-Net architectures, preserve key features during the upsampling process. The generator's architecture is summarized in Table 2.

Discriminator Network.
The primary function of the discriminator network is to discern and differentiate between authentic mammogram images and artificially generated ones. The architecture consists of convolutional layers, followed by batch normalization and LeakyReLU activation functions to introduce nonlinearity. The discriminator's architecture is summarized in Table 3.
The training employs adversarial loss functions, such as binary cross-entropy or Wasserstein loss, to simultaneously optimize the generator and discriminator networks. The Adam optimizer is utilized for its robustness in handling nonstationary data and complex loss landscapes.

Training Process.
A crucial stage in this research is the network's training process, during which the generator learns to create realistic mammogram images and the discriminator develops its capacity to tell real images from fake ones. The key components of the training process are described in this subsection, including the tuning of hyperparameters, loss functions, and convergence monitoring, as shown in Table 4.
The generator and discriminator networks compete in a two-player minimax game as part of the adversarial training approach. While the discriminator strives to become more accurate in distinguishing real from fake images, the generator seeks to reduce the discriminator's ability to differentiate between real and synthetic mammogram images.
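This two-player minimax game corresponds to the standard GAN objective of Goodfellow et al., in which D maximizes and G minimizes the same value function:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

Here \(p_{\mathrm{data}}\) is the distribution of real mammograms and \(p_z = N(0, 1)\) is the latent noise prior from which z is sampled.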

Loss Function.
During training, the DCGAN uses binary cross-entropy loss as the primary loss function for both the generator and the discriminator. For the discriminator's real/fake classification, this loss measures the difference between predicted and ground-truth labels. The Wasserstein loss can be used as an alternative to stabilize training. Figure 8 shows the losses of both the discriminator and generator networks. Figure 9 shows the real and fake images of the proposed model during training.
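A minimal NumPy sketch of the binary cross-entropy loss described above (the function name and the epsilon clamp are illustrative choices, not from the paper):

```python
import numpy as np

def bce_loss(p: np.ndarray, y: np.ndarray, eps: float = 1e-7) -> float:
    """Binary cross-entropy between predicted realism probabilities p
    and ground-truth labels y (1 = real, 0 = fake)."""
    p = np.clip(p, eps, 1.0 - eps)  # avoid log(0)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

# Discriminator loss: real batch labelled 1, fake batch labelled 0.
# Generator loss: the fake batch is labelled 1, since G is rewarded
# when D assigns high realism probability to synthetic images.
```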

Mean Similarity Assessment.
After generating synthetic mammograms, all synthetic and original images were mixed class-wise. For each class, the mean similarity is calculated, which provides insight into the consistency and similarity of the generated mammograms within that class.
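The per-class similarity computation can be sketched as follows. The paper states only that a mean similarity with standard deviation was computed per class; the choice of cosine similarity against the class mean image here is an assumption for illustration.

```python
import numpy as np

def class_similarity_scores(images: np.ndarray) -> np.ndarray:
    """Cosine similarity of every flattened image in one class
    to the class mean image. `images` has shape (N, H, W)."""
    flat = images.reshape(len(images), -1).astype(np.float64)
    mean = flat.mean(axis=0)
    num = flat @ mean
    den = np.linalg.norm(flat, axis=1) * np.linalg.norm(mean)
    return num / den

# Per-class statistics used in the assessment:
#   scores = class_similarity_scores(class_images)
#   mu, sigma = scores.mean(), scores.std()
```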
During validation of the synthetic images, 95 out of 600 images in the normal class were declared outliers. In the benign class, 105 images were declared outliers, while in the malignant class, 110 images were declared outliers according to the similarity score, as shown in Figures 10-12, using the three times standard deviation criterion. Larger distances indicated greater dissimilarity, while smaller distances suggested better alignment with the class mean.
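The three times standard deviation criterion can be sketched directly (the function name is illustrative; k = 3 matches the threshold used in the paper):

```python
import numpy as np

def remove_outliers(scores: np.ndarray, k: float = 3.0) -> np.ndarray:
    """Return indices of images whose similarity score lies within
    k standard deviations of the class mean (k = 3 in this study)."""
    mu, sigma = scores.mean(), scores.std()
    keep = np.abs(scores - mu) <= k * sigma
    return np.flatnonzero(keep)
```

Applied per class, this yields the retained counts reported above (505 normal, 495 benign, 490 malignant out of 600 each).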
In Figure 13, distinct and coherent clusters of similar data points were evident after removing outliers. This highlighted the effectiveness of the proposed approach in forming meaningful clusters. Distance-based validation methods provided a robust means of quantifying the authenticity of the synthetic mammograms, improving accuracy and strengthening the reliability of the generated data for breast cancer imaging applications.

Validation from Human Experts.
Considering how realistic some of the DCGAN-generated images look, we asked three medical experts with more than 10 years of experience in radiology and mammography to classify synthetic and real images. Each radiologist was shown 80 images, a 50/50 mixture of real and synthetic, and was asked to rank them based only on their visual appearance. The experts achieved an average accuracy of only 68%, showing how visually convincing the generated images are. In the Expert Panel Review phase, a group of radiologists with extensive experience in mammography evaluated the synthetic mammogram images. This panel was carefully selected based on their clinical expertise and familiarity with mammographic interpretation. They conducted a detailed assessment of each synthetic image, focusing on critical diagnostic features such as tissue density, lesion characterization, and calcifications or other anomalies indicative of potential pathology. Their assessment aimed to determine the realism and diagnostic accuracy of the synthetic images, comparing them to actual mammograms. The radiologists' feedback provided valuable insights into the clinical viability of the synthetic images, ensuring that they met the standards required for effective diagnostic use in a clinical setting.

Conclusions and Future Work
This research provides an in-depth study of the utilization of a DCGAN for the generation of mammograms and their subsequent validation. The research was centered on three classes of mammography, and it utilized a statistical methodology that involved the three times standard deviation criterion in examining the mean similarity of each class and the distances of individual data points from their respective means. The findings demonstrate that the proposed network can successfully generate synthetic mammograms that exhibit traits and properties comparable to real mammograms. As a result of rigorous training, the network gained the capability to produce high-quality synthetic images that capture the distinctive patterns and structures characteristic of each class, as demonstrated by the synthetic images produced across the training epochs. The calculation of mean similarity offered insights into the consistency and similarity of the generated mammograms within each class, further highlighting the network's capacity to capture class-specific properties. The statistical validation strategy relied on calculating distances between mammograms to ensure the generated mammograms were genuine. Some of the generated images were also validated by human radiologists, confirming the authenticity of the proposed model. The research provided a reliable approach for evaluating dissimilarity and alignment by first estimating the level of variation from class means and then utilizing the three times standard deviation criterion as the yardstick. Notably, eliminating outliers revealed cohesive and distinct clusters of similar data points, confirming that the strategy effectively produces meaningful clusters. We plan to test more datasets with more GAN architectures in the future.

GAN augmentation was compared to traditional augmentation. The results show that using DCGANs with flipping augmentation improves the F1 score by up to 0.09 compared to the original mammographic images. The work can be expanded to include other similar tasks, although it is limited to small mammogram patches. Shen et al. [20] developed a GAN-based system using a benchmark DDSM dataset and a local dataset collected from Nanfang Hospital, China. The study aimed to address the issue of limited data in medical image analysis by designing a model to generate labelled images based on contextual information within the breast mammograms. The model was evaluated, and the results showed that their augmentation technique increased the diversity of the dataset and achieved an improvement of 5.03% in the detection rate. The model is a viable option for generating labelled breast images. In [21], the authors proposed a deep learning-based mammogram recognition model. The model uses a special autoencoder-generative adversarial network (AGAN) for data augmentation. The generator produces additional images well suited for training the model. The final set of original and generated images is given as input to a CNN for classification. A total of 11,218 ROIs of mammograms from DDSM were used in the experiments. They reported an average accuracy of 89.71% in detecting abnormal vs. healthy cases. The specificity was 80.58%, while the sensitivity and AUC were 93.54% and 0.9410, respectively. The work's main contribution was the novelty of its data augmentation compared to other deep learning methods. The proposed AGAN model is learned only on normal data, and the model does not consider other mammographic datasets.

(iii) The reliability of the proposed model is further corroborated through visual validation conducted by expert radiologists. Their professional assessment confirms the clinical accuracy and utility of the synthetic mammograms. (iv) The study showcases the consistency of the synthetic dataset through detailed visualizations of class clustering. These visualizations highlight the congruence between the generated mammograms and the real data distribution. The substantial number of images from each class passing the similarity assessment underscores the success of the proposed validation mechanism.

Figures 4-7:
Figures 4(a) and 4(b) represent the images generated during the initial phases; initially, the training process takes place over random noise. Figure 5(a) represents the synthesized images from epoch 2 during training, while Figure 5(b) shows synthesized images from epoch 3. Figure 6(a) shows synthesized images from epoch 45, while Figure 6(b) shows synthesized images from epoch 50. Figure 7(a) represents synthesized images from epoch 99 during the training of the DCGAN; these images are closer to the real ones. Figure 7(b) shows the final synthesized images from epoch 100, the finest images of the proposed model during the entire training. Figure 8 shows the losses of both the discriminator and generator networks. Figure 9 shows the real and fake images of the proposed model during training.


Figure 8: Discriminator and generator network losses during training.

Figure 11: (a) Similarity scores of the benign class. (b) Similarity scores of the benign class with outliers in red. (c) Similarity scores of the benign class without outliers.

Figure 12: (a) Similarity scores of the malignant class. (b) Similarity scores of the malignant class with outliers in red. (c) Similarity scores of the malignant class without outliers.

Figure 13: Clusters of the data points after outlier removal.

Table 1: Comparison of various GAN-based mammogram augmentation techniques.