A Custom Backbone UNet Framework with DCGAN Augmentation for Efficient Segmentation of Leaf Spot Diseases in Jasmine Plant

Leaf blight spot disease, caused by bacteria and fungi, poses a considerable threat to commercial plants, manifesting as yellow to brown spots on the leaves and potentially leading to plant mortality and reduced agricultural productivity. The susceptibility of jasmine plants to this disease emphasizes the necessity for effective detection methods. In this study, we use a deep convolutional generative adversarial network (DCGAN) to generate a dataset of jasmine plant leaf disease images. Using DCGAN, we curate a dataset comprising 10,000 images with two distinct classes specifically designed for segmentation applications. To evaluate the effectiveness of DCGAN-based generation, we propose and assess a novel loss function. For accurate segmentation of the leaf disease, we utilize a UNet architecture with a custom backbone based on the MobileNetV4 CNN. The proposed segmentation model yields an average pixel accuracy of 0.91 and an mIoU (mean intersection over union) of 0.95. Furthermore, we explore different UNet-based segmentation approaches and evaluate the performance of various backbones to assess their effectiveness. By combining DCGAN for dataset generation with the UNet framework for precise segmentation, we contribute to the development of effective methods for detecting and segmenting leaf diseases in jasmine plants.


Introduction
In recent times, there has been a notable rise in the occurrence of diseases caused by microorganisms such as bacteria, fungi, and viruses in plants, animals, and humans [1]. These infections present a significant threat to plants throughout different stages of agricultural production, ultimately resulting in reduced plant yield [2,3]. The consequences of these diseases have far-reaching implications for human dependence on agriculture, encompassing vital necessities such as food, shelter, and clothing. This is especially notable in low-income countries [4,5]. Jasmine plants, commonly cultivated in coastal regions of Southeast Asia [6], are known to be vulnerable to a range of leaf diseases, including Alternaria leaf blight spot [7]. The initial signs of this disease are yellow patches with dark brown stains surrounded by yellow rings [8]. As the disease progresses, the spots grow larger, spreading across a significant portion of the leaves and eventually leading to blight. Notably, concentric rings can be observed within the lesions, and the disease also affects the stem, petiole, and flowers [9]. The final two stages are critical for disease detection. The transition from yellow to brown in leaves is referred to as the "brown stage." The subsequent "final stage" involves the maximum coverage of brown spots on the leaf, leading to the plant's death. Identifying these crucial stages is essential, as timely action can then be taken to address the issue effectively [10]. Several CNN-based approaches for detecting plant leaf diseases have been reported on various datasets, including cassava, tomato, cotton, and tobacco [11][12][13][14][15][16]. Nevertheless, research on jasmine plant leaf spot disease detection remains limited. The scarcity of suitable datasets poses a significant challenge in the development of CNN-based algorithms capable of detecting the various stages of jasmine plant leaf spot disease. In the past, several segmentation and morphological methods have been reported for grapes and other leaves [17][18][19]. However, there is a need for a semantic segmentation method specifically designed to extract the leaf spot features of jasmine plants.
The contributions of this work are as follows:
(1) This study introduces a leaf image augmentation strategy employing DCGAN, resulting in an expanded dataset with 10,000 synthetic jasmine plant images. Compared with conventional methods, our approach exhibits superior scalability and image quality. Comparative analyses underscore the effectiveness of our DCGAN-based augmentation as a dataset expansion technique.
(2) Our proposed methodology for identifying the "brown stage" and "final stage" of leaf spot disease in jasmine leaves uses UNet-based semantic segmentation, specifically ResUNet with a custom CNN backbone. The approach achieves higher accuracy and efficiency than traditional methods, and comparative evaluations highlight its advantage in disease stage recognition.
(3) This research explores various semantic segmentation techniques and pretrained CNN backbones for leaf spot identification. The proposed model, with an mIoU of 0.95, surpasses alternative segmentation methods, providing a more precise and reliable classification of disease stages. Comparative assessments underscore the effectiveness of our model in capturing nuanced details.
Section 2 discusses recent works proposed for leaf disease detection in the literature. Sections 3 and 4 introduce the proposed model and present the experimentation results. Finally, Section 5 provides the study's conclusion.

Related Works
In recent years, remarkable progress has been achieved in detecting diseases from leaf images. These approaches can be broadly classified into two main groups: traditional detection methods and deep learning-based detection methods. In addition, this section discusses various augmentation techniques employed to expand the dataset.

Traditional Detection Methods.
A leaf stage recognition system was developed by incorporating K-means clustering approaches [20] to focus on specific areas that play a crucial role in leaf disease detection. Geetha et al. [21] proposed four preprocessing steps to reduce noise in the leaf image dataset. Furthermore, Annabel et al. [22] utilized traditional detection techniques, including the K-nearest neighbor (KNN) algorithm, to classify plant leaves based on morphological features such as color, intensity, and size. For color analysis, Narmadha and Arulvadivu [23] reported the conversion of primary leaf colors into LAB color space and employed clustering algorithms. In the work of Gupta et al. [24], the background was removed automatically and the diseased portion was extracted for mildew disease detection from cherry leaves. In addition, Kurmi and Gangwar [25] employed color transformation for seed region identification in leaf analysis. Literature [26] describes several methods used in precision agriculture. However, achieving high classification accuracy in leaf spot detection has proven to be a challenge for most machine learning approaches. In this context, various studies have explored deep learning methods for leaf morphology identification, which are discussed in the following subsection.

Deep Learning-Based Detection Methods. The detection of tomato plant disease through deep learning-based segmentation has been previously explored in the works of Shoaib et al. [27] and Agarwal et al. [14]. Another study by Xie et al. [28] proposes a technique utilizing a fully convolutional neural network (FCN) for the segmentation of maize leaf disease. Prior studies have presented deep neural network-based classification models for plant diseases. Hridoy et al. [29] employed a deep neural network approach to identify betel leaf diseases. Kaur et al. [30] introduced a semiautomatic CNN model for soybean leaf disease classification. Haridasan et al. [31] developed a CNN-based detection model for paddy leaf diseases. Furthermore, Alsabai et al. [32] proposed a hybrid deep learning approach, incorporating improved Salp swarm optimization, for the multiclass detection of grape diseases. Shoaib et al. [33] focused on addressing the challenge of accurately identifying diseased spots amidst complex field conditions. They trained their proposed system using a dataset comprising crop leaf images with both healthy and diseased sections. The algorithm's performance in segmenting lesion regions from the images was evaluated using metrics such as accuracy and intersection over union (IoU). In a different context, Lin et al. [34] propose a semantic segmentation model that employs convolutional neural networks (CNNs) to recognize and segment powdery mildew at the pixel level in images of cucumber leaves. Their approach achieved an intersection over union score of 79.54% and a dice accuracy of 81.5% based on 20 test images. Finally, Soliman et al. [35] proposed employing deep learning techniques to detect plant lesions by extracting hidden patterns from plant leaf disease images. Despite the availability of plant disease datasets such as the PlantVillage dataset [36], the AgriVision collection [37] [41,42]. In addition, a combination of rotation and shift was explored to increase the dataset further [43]. By utilizing GAN-based augmentation, the dataset's enlargement resulted in a 20% increase in classification accuracy [44]. Another study employing a detection framework saw an improvement of 7.4% in classification accuracy [45]. Data augmentation is therefore important for efficiently enhancing datasets for detection and classification approaches. A novel augmentation method is detailed in the next section.

Methodology
This research focuses on enhancing the detection of disease spots on jasmine leaves, particularly brown-stage and final-stage spots that are challenging to identify accurately. To overcome limited data, a GAN-based augmentation model is employed to expand the leaf dataset used for segmentation.
The study explores the effectiveness of UNet, WUNet, U2Net, and ResUNet architectures in this context while also investigating different segmentation backbones to optimize the detection performance, as shown in Figure 1.
3.1. Dataset. Figure 2 presents image samples depicting different stages of diseased leaves. The dataset was developed in collaboration with experts from Krishi Vigyan Kendra, Karnataka, India, who used digital cameras to capture a total of 1000 images. These images cover two stages of Alternaria leaf blight spot disease: 450 images for the brown stage, as illustrated in Figure 2(a), where the blight spot covers a quarter of the leaf, and 550 images for the final stage, as depicted in Figure 2(b), where blight spots cover a larger area of the leaf. To enhance the dataset, generative adversarial network-based augmentation techniques were employed. The early stage of leaf spot disease was not considered in this study; the focus was on the later brown and final stages, which are crucial for understanding disease progression. Further details regarding the application of augmentation techniques can be found in the subsequent section.

Used as the encoder, MobileNetV4 significantly reduces the number of model parameters compared to traditional architectures while preserving the decoder's precision, thereby reducing computational load. Our choice of these semantic segmentation models was driven by specific strengths: UNet's efficiency in preserving structural elements, U2Net's lightweight design for real-time segmentation without compromising precision, WUNet's adaptability to resource constraints, and ResUNet's balance between accuracy and efficiency. We conducted experiments to determine the optimal model for jasmine leaf disease detection. The backbone is the large MobileNetV4 variant. Its implementation in the encoder is shown in Figure 4, and detailed information is provided in Table 1. The proposed CNN uses depthwise separable convolution, which resembles traditional convolution but involves a two-stage calculation. Unlike the conventional approach, where a single convolution is performed per layer, depthwise separable convolution divides the process into two phases. The first stage applies a separate convolution with a 3 × 3 kernel to each input channel, followed by batch normalization and activation. This phase is referred to as depthwise convolution. The second stage processes the output channels of the depthwise convolution using a 1 × 1 pointwise convolution, which is applied across all depthwise output channels. Overall, depthwise separable convolution enhances efficiency by reducing the computational load. Table 1 provides details about convolution layers 1 and 2. For clarity, in this study, we denote the depthwise convolution layer as "conv_dw" and the pointwise convolution layer as "conv_pw." This process is repeated for layers 3 to 6, and the final convolutional layer is identified as "layer 7." Table 1 also shows the parameter reduction at each sequential layer, highlighting the achieved computational efficiency.
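The saving from the two-stage scheme can be made concrete by counting weights. The sketch below is illustrative only: the channel sizes (32 in, 64 out) are assumed values that do not come from Table 1, and bias and batch-normalization parameters are ignored. The function names are hypothetical.

```python
def standard_conv_params(c_in, c_out, kernel=3):
    """Weights in one ordinary k x k convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def depthwise_separable_params(c_in, c_out, kernel=3):
    """Weights in a depthwise (conv_dw) plus pointwise (conv_pw) pair."""
    conv_dw = kernel * kernel * c_in  # one k x k filter per input channel
    conv_pw = c_in * c_out            # 1 x 1 convolution mixing the channels
    return conv_dw + conv_pw


# Assumed example sizes; not taken from the paper's Table 1.
c_in, c_out = 32, 64
standard = standard_conv_params(c_in, c_out)
separable = depthwise_separable_params(c_in, c_out)
print(standard, separable, round(separable / standard, 4))
```

For a 3 × 3 kernel the ratio of separable to standard weights is 1/c_out + 1/9, so the saving approaches a factor of nine as the number of output channels grows.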

Data Augmentation
UNet is an encoder-decoder model comprising two distinct networks, namely, the contraction network and the expansion network. The contraction network, referred to as the encoder, is responsible for extracting pertinent features from the leaf image [47]. The expansion network, known as the decoder, reconstructs the segmentation map using the encoded features obtained from the encoder [48]. The original UNet model is designed with four blocks to extract spatial features from the image. Each block consists of two convolution layers with ReLU activation and a max-pooling layer that downsamples the input by a factor of 2 [49]. The proposed UNet model extends beyond the four blocks and includes additional convolution layers activated by the leaky ReLU activation function. These enhancements, along with the custom backbone, contribute to capturing low-level features essential for the leaf spot disease model. The overall network architecture is shown in Figure 5.

Comparison of Different UNet-Based Segmentation Approaches. In this research, we investigate and compare several UNet-based segmentation architectures, each offering distinctive design characteristics and advantages for leaf spot detection tasks. The UNet architecture features a symmetric encoder-decoder design that uses skip connections to concatenate feature maps from the encoder with the corresponding decoder layers. This approach preserves high-resolution information during the decoding process. WUNet, an extension of UNet, is commonly referred to as wide UNet [50]. It widens the convolutional layers with an increased number of channels. This design choice improves the model's capture of contextual information, potentially leading to enhanced segmentation performance. U2Net, a recent and specialized architecture, is tailored for salient object detection. Inspired by UNet, U2Net incorporates several improvements, including additional branches and attention mechanisms. These attention modules highlight salient features, making U2Net well suited to tasks requiring precise boundary detection. ResUNet, also known as residual UNet, is a variant of UNet that integrates residual connections derived from the ResNet architecture. These residual connections facilitate gradient flow during training, enabling the effective training of deeper architectures. This capability makes ResUNet [51] particularly well suited to more complex segmentation tasks. Through a comprehensive evaluation and comparison of these UNet-based models, we aim to gain insights into their individual performance, strengths, and suitability for a diverse range of semantic segmentation challenges.

Assessing the Different Backbone Architectures for Segmentation Models. To assess the semantic functionality, all the models are trained with various pretrained networks, such as ResNet, EfficientNet, VGG16, and VGG19, as backbones. In addition, a custom backbone is utilized for the evaluation. The backbone models are employed in the encoder part of the semantic segmentation models (UNet, WUNet, U2Net, and ResUNet). Initially, each semantic segmentation model is assessed with baseline backbones. Subsequently, the pretrained models and the custom backbone CNN are used one by one to assess the model. For this study, UNet with skip connections is employed as the segmentation model.

Steps Used for Leaf Spot Disease Detection Using the Custom Backbone UNet Framework
(1) Prepare the leaf dataset using DCGAN augmentation.
(2) Train the chosen segmentation model with input images and corresponding masks, considering performance metrics like mIoU, Dice, and pixel accuracy calculated using equations (1)-(4).

The GAN model is trained for 100 epochs, with 5000 iterations in total. To initiate the process, the initial value of the loss function's λ is set at 0.1 [52]. As iterative training progresses, the λ value is updated to 0.01. Following the iterative training process, the augmented images, along with their corresponding masks, are passed to the segmentation block. The study employs various models, including UNet, WUNet, U2Net, and ResUNet, all utilizing a 3 × 3 kernel. Each model undergoes 300 training epochs.

Hyperparameter Tuning of Segmentation Models. The UNet, WUNet, U2Net, and ResUNet models were trained using various backbones, each with a batch size of 32, over 300 epochs. Here, the batch size determines how many samples are processed before the model's parameters are updated, while epochs represent the number of complete passes over the training data. The learning rate, a critical hyperparameter, was set to 0.0001 to balance learning speed and convergence. The Adam optimization method was used for model compilation, and all convolutional layers used a 3 × 3 kernel with the ReLU activation function. Early stopping based on validation performance was applied during training to prevent overfitting. In our proposed segmentation model with a custom backbone, the Adam optimizer was utilized with a learning rate of 0.001 and a batch size of 32. The model was trained for 100 epochs with the ReLU activation function, following an iterative process to determine the optimal parameter settings.
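The early-stopping rule mentioned above can be sketched in a framework-independent way. This is a minimal illustration, not the authors' implementation: the text does not state the patience or the monitored quantity, so the `patience` values, the use of validation loss, and the example loss sequence are all assumptions.

```python
class EarlyStopping:
    """Signal a stop when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=10, min_delta=0.0):
        self.patience = patience    # assumed value; not given in the text
        self.min_delta = min_delta  # smallest decrease that counts as improvement
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch and return True if training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def epochs_run(val_losses, patience):
    """Number of epochs executed before the stopper fires (or the data ends)."""
    stopper = EarlyStopping(patience=patience)
    for epoch, loss in enumerate(val_losses, start=1):
        if stopper.step(loss):
            return epoch
    return len(val_losses)


# Hypothetical validation losses, for illustration only.
losses = [0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.5]
print(epochs_run(losses, patience=3))
print(epochs_run(losses, patience=5))
```

A small patience ends training sooner but can stop before a late improvement, as the last value in the hypothetical sequence shows.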

Evaluation Metrics.
To assess the efficacy of the DCGAN augmentation method, we analyze the similarity between the synthesized images and the template images. This evaluation uses well-established similarity metrics, including the peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM) [53]. These metrics indicate the degree of resemblance between the generated images and the target images, enabling a thorough evaluation of the DCGAN augmentation model's performance. The segmentation tasks are evaluated using mIoU, the Dice coefficient, and pixel accuracy, as defined in equations (1)-(4).
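The equations themselves are not reproduced in this text, so the sketch below uses the standard definitions, which may differ in form from the paper's equations (1)-(4): for a class with TP true-positive, FP false-positive, and FN false-negative pixels, IoU = TP / (TP + FP + FN) and Dice = 2TP / (2TP + FP + FN); mIoU averages IoU over classes, and pixel accuracy is the fraction of correctly labelled pixels. PSNR is 10 log10(MAX^2 / MSE). Masks and images are flattened into plain lists, and all example values are made up for illustration.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    return float("inf") if mse == 0 else 10.0 * math.log10(max_value ** 2 / mse)


def _counts(truth, pred, cls):
    """True-positive, false-positive, false-negative pixel counts for one class."""
    tp = sum(1 for t, p in zip(truth, pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(truth, pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(truth, pred) if t == cls and p != cls)
    return tp, fp, fn


def iou(truth, pred, cls):
    tp, fp, fn = _counts(truth, pred, cls)
    denom = tp + fp + fn
    return tp / denom if denom else 1.0  # class absent in both masks


def dice(truth, pred, cls):
    tp, fp, fn = _counts(truth, pred, cls)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


def mean_iou(truth, pred, classes):
    return sum(iou(truth, pred, c) for c in classes) / len(classes)


def pixel_accuracy(truth, pred):
    return sum(1 for t, p in zip(truth, pred) if t == p) / len(truth)


# Made-up flattened masks: 0 = background, 1 = diseased spot.
truth = [0, 0, 1, 1, 1, 0, 0, 1]
pred = [0, 1, 1, 1, 0, 0, 0, 1]
print(iou(truth, pred, 1), dice(truth, pred, 1))
print(mean_iou(truth, pred, [0, 1]), pixel_accuracy(truth, pred))
print(round(psnr([0, 0, 0, 0], [10, 10, 10, 10]), 2))
```

SSIM is omitted here because it involves windowed local statistics; a library implementation is usually preferable to a hand-written one.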

Result and Discussion
The FID score is a widely adopted metric for assessing the fidelity of generated images in relation to real images from a given dataset. In this context, the objective is to train the DCGAN so that the FID score remains stable and lies within the specified range of 13 to 15, as depicted in Figure 6. Keeping the FID score within this range signifies that the generated images closely mirror the characteristics of the real images in the dataset, with a notable level of visual quality and diversity.
[Figure label: Custom MobileNetV4 network at the encoder block]
The SSIM score reflects the structural similarity between generated images and real images, while the PSNR score reflects the level of fidelity and noise present in the generated images. Higher SSIM scores suggest a closer resemblance to real images, whereas higher PSNR scores indicate enhanced image quality. Upon comparing both sets of generated images, it was observed that final-stage images attained superior scores in both SSIM and PSNR metrics. This implies that they exhibit better similarity and higher quality when compared to real images. These findings offer insights into the performance of our image generation models and underscore the superior capabilities of the final stage in producing high-quality images. Figure 7 presents brown-stage images generated using DCGAN.
In Figure 8, the final-stage images generated by DCGAN are displayed. The analysis reveals that the brown-stage-generated images outperform the final-stage images both qualitatively and quantitatively.
Figure 9 shows the segmentation results of four models: UNet, WUNet, U2Net, and ResUNet, each equipped with the custom backbone. The comparison indicates that the UNet model with the custom backbone outperforms the WUNet, U2Net, and ResUNet models with the same custom backbone in segmentation performance.
In Figure 10, we present the training accuracy of several segmentation models, each integrating pretrained CNN networks in conjunction with our custom backbone. Notably, the UNet segmentation with the custom backbone emerges as particularly effective for detecting leaf spot diseases. The training accuracies depicted in Figures 10(b) and 10(c) vary throughout the epochs. Figure 10(a) illustrates the performance of ResUNet, which exhibits similar fluctuations. However, Figure 10(d), representing the proposed UNet with the custom backbone, shows superior performance. This suggests that the custom backbone enhances the UNet segmentation model's efficacy in comparison to other configurations, underscoring its potential for accurate and robust leaf spot disease detection. Further details and insights into these results are discussed in the subsequent sections.
In Table 2, the performance metrics, namely, the mean IoU (mIoU) and the Dice coefficient (Dice), are shown for the two-stage leaf disease classification employing various backbone CNN networks. The results demonstrate that the proposed custom backbone combined with the UNet semantic segmentation yields superior outcomes. This framework successfully extracts the low-level features for leaf spot disease detection, enhancing the accuracy of the classification process.
Table 3 presents the outcomes of the evaluation conducted to determine the most suitable backbone. Figure 11 offers a visual depiction of pixel accuracy metrics derived from the evaluation of four models: UNet, WUNet, U2Net, and ResUNet, each configured with its own backbone architecture. Our proposed UNet for semantic segmentation stands out for its performance, which is amplified by the integration of a custom backbone. In the specific case of the custom backbone combined with ResUNet, the highest pixel accuracy of 0.98 is achieved. This underscores the effectiveness of the custom backbone in enhancing the segmentation capabilities of ResUNet. Among the other models, UNet demonstrated a pixel accuracy of 0.90, WUNet recorded 0.85, and U2Net yielded 0.87. These outcomes emphasize the superior performance of our proposed ResUNet with a custom backbone, especially when contrasted with the other segmentation models explored in this study that employed diverse backbone configurations. This examination of pixel accuracy metrics highlights the performance of the proposed ResUNet and provides insights into the relative strengths of each model.

[Figure panel labels: Brown stage, Final stage]

Journal of Computer Networks and Communications

The integration of a custom backbone, particularly in conjunction with ResUNet, emerges as a pivotal factor in achieving outstanding pixel accuracy.
Figure 12 provides a detailed view of the confusion matrices associated with four models: UNet, WUNet, U2Net, and ResUNet. Each of these models uses diverse backbones to predict both the brown and final stages of leaf disease. A standout observation is the performance achieved by our proposed backbone in conjunction with ResUNet, which reaches a prediction accuracy of 95%.
This accuracy underscores the efficacy of our proposed backbone when integrated with ResUNet, showing its capability to accurately predict both brown and final stages of leaf disease. The combination of the custom backbone and ResUNet evidently contributes to superior predictive outcomes.
In conclusion, the results presented in Figure 12 affirm the strength of our proposed approach. The 95% prediction accuracy demonstrates the practical success of our model in handling the complexity of leaf disease prediction. This achievement also indicates the potential impact of innovative backbone configurations in enhancing the overall performance of segmentation models. The combination of a well-designed backbone with ResUNet stands out as a key factor in achieving this accuracy.

Conclusion
In conclusion, this paper introduces a segmentation approach for effectively detecting leaf spot disease. The study employs various baseline models (UNet, WUNet, U2Net, and ResUNet), each integrated with distinct pretrained CNN backbones in the encoder path, leading to significant improvements in segmentation efficiency. One of the key contributions of this research is a custom backbone specifically tailored for UNet segmentation, which demonstrated high accuracy in delineating spots associated with both brown-stage and final-stage leaf spot diseases. In addition, the study explores the efficacy of DCGAN-based augmentation, an efficient process that generates 10,000 images (5,000 images for each type). This augmentation technique enriches the dataset, resulting in notable performance enhancements for the segmentation models. Specifically, our proposed DCGAN augmentation achieved an SSIM score of 0.90 ± 0.172 and a PSNR score of 25 ± 2.4. The proposed approach shows potential for advancing leaf spot disease detection and has practical implications for agricultural research and applications. The results underscore the importance of employing efficient segmentation techniques and augmentations to raise the accuracy and reliability of disease classification. Furthermore, the integration of the custom backbone has proven particularly beneficial, enabling the detection model to capture low-level features of brown spots of varying sizes with an mIoU of 0.95. This customized backbone can be implemented in lightweight networks suitable for mobile-based applications. Despite the heightened computational complexity and extended training time associated with the larger and more diverse dataset, our deployed segmentation model exhibited improved efficiency. Initially, segmentation results were suboptimal, with mIoU and Dice scores falling below 0.5. However, substantial enhancements were observed post-augmentation, with reported mIoU reaching 0.91 and Dice reaching 0.96, underscoring the effectiveness of DCGAN augmentation in refining segmentation accuracy and consistency. In addition, our proposed segmentation model handled the larger and more complex dataset effectively, achieving a pixel accuracy of 0.96. The augmentation process was introduced to address data-related efficiency challenges, and our segmentation model proved capable of managing the resulting complexity. Overall, the findings of this research open up new avenues for more effective leaf spot disease detection, offering insights into the application of efficient segmentation methods and augmentations in agriculture. With continued development and implementation, the proposed approach has the potential to make a significant impact on crop disease management and contribute to the advancement of agricultural practices.

(3) Train and fine-tune the segmentation model parameters.
(4) Iteratively train the model until achieving a satisfactory training and validation accuracy curve; otherwise, repeat Step 3.
(5) Deploy the model for testing on a real image test set.
(6) Output the segmentation results to identify the brown stage and final stage of the leaf.
The overall flowchart of the leaf spot disease detection is illustrated in Figure 4.

3.4. Training Details. The segmentation task involves using an augmentation model to generate a total of 5000 images for each type. During the training process, the loss function L_adv of the DCGAN is fine-tuned.

Figure 5: Proposed UNet segmentation model with a custom backbone.

Figure 7: Sample images obtained using DCGAN image augmentation on brown-stage images.

Figure 8: Sample images obtained using DCGAN image augmentation on final-stage images.

Figure 12: Confusion matrix of models UNet, WUNet, U2Net, and ResUNet, each utilizing distinct backbones for the prediction of brown and final stages of leaf disease.
Generative adversarial networks (GANs) were pioneered by Ian Goodfellow and his colleagues, and the deep convolutional variant, DCGAN, was introduced in 2015 [46]. The DCGAN's conditional input allows the generator to produce synthetic samples based on specified conditions. Convolutional neural networks (CNNs) are widely adopted in GANs, particularly for image processing, delivering strong results in various computer vision tasks. The generator takes a compressed representation of the training image set, consisting of 1000 images, and generates new images with a resolution of 256 × 256 pixels in RGB format. A 100-dimensional vector with random values between 0 and 1 augments the input image generation process. To achieve the desired resolution for generated images, the generator incorporates convolutional transpose layers, while the discriminator relies on two convolutional layers with 256 neurons each and LeakyReLU activation. The training process utilizes the SGD optimizer and focuses on minimizing the L_adv loss. The aim is to prevent the discriminator from accurately distinguishing fake images. During training, the GAN model aims for a Frechet inception distance (FID) score below 15 as a performance measure. Training involves 200 epochs with a batch size of 32, and the similarity between generated and template images is evaluated using the structural similarity index (SSIM) and peak signal-to-noise ratio (PSNR) metrics. The overall methodology is illustrated in Figure 3.

3.3. Proposed Segmentation Model for Jasmine Plant Leaf Disease Detection. Image segmentation is a crucial aspect of computer vision, wherein an image is divided into different regions and assigned specific class labels to create a map that provides information about each pixel of the image. A custom backbone based on MobileNetV4 is integrated into the UNet-based architectures to detect critical types of leaf spot diseases in jasmine plants. Integrating MobileNetV4 into the UNet frameworks (UNet, WUNet, U2Net, and ResUNet) involves using it as the encoder component. It replaces conventional convolutional layers, bringing its efficient multiscale feature extraction capabilities and reducing the number of model parameters.
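Returning to the DCGAN generator described earlier in this section: the text says transposed convolutions bring the output to 256 × 256 but does not give the layer configuration. The sketch below traces one plausible path under assumed settings (a 4 × 4 starting feature map, kernel 4, stride 2, padding 1); these values and the function names are hypothetical, not the authors'. It uses the usual output-size relation for a transposed convolution with dilation 1: out = (in - 1) * stride - 2 * padding + kernel + output_padding.

```python
def conv_transpose_out(size, kernel=4, stride=2, padding=1, output_padding=0):
    """Output side length of a transposed convolution with dilation 1."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


def upsampling_path(start=4, target=256, **layer):
    """Side lengths obtained by stacking identical layers until target is reached."""
    sizes = [start]
    while sizes[-1] < target:
        nxt = conv_transpose_out(sizes[-1], **layer)
        if nxt <= sizes[-1]:
            raise ValueError("layer settings do not increase the spatial size")
        sizes.append(nxt)
    return sizes


# Assumed settings (4 x 4 start, kernel 4, stride 2, padding 1); not from the paper.
sizes = upsampling_path()
print(sizes)
print(len(sizes) - 1, "transposed-convolution layers")
```

With these assumed settings each layer doubles the side length, so six layers take a 4 × 4 map to 256 × 256; other kernel, stride, and padding choices would change the depth.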

Table 1: Overview of the network framework, featuring MobileNetV4 large as the backbone architecture.

Table 2: Performance metrics for various segmentation models with various backbones.

Figure 11: Pixel accuracy metrics of four different models UNet, WUNet, U2Net, and ResUNet, each utilizing distinct backbones.