Multiloss Function Based Deep Convolutional Neural Network for Segmentation of Retinal Vasculature into Arterioles and Venules

The arterioles and venules (AV) classification of retinal vasculature is considered as the first step in the development of an automated system for analysing the vasculature biomarker association with disease prognosis. Most of the existing AV classification methods depend on the accurate segmentation of retinal blood vessels. Moreover, the unavailability of large-scale annotated data is a major hindrance in the application of deep learning techniques for AV classification. This paper presents an encoder-decoder based fully convolutional neural network for classification of retinal vasculature into arterioles and venules, without requiring the preliminary step of vessel segmentation. An optimized multiloss function is used to learn the pixel-wise and segment-wise retinal vessel labels. The proposed method is trained and evaluated on DRIVE, AVRDB, and a newly created AV classification dataset; and it attains 96%, 98%, and 97% accuracy, respectively. The new AV classification dataset is comprised of 700 annotated retinal images, which will offer the researchers a benchmark to compare their AV classification results.


Introduction
Arteries and veins are major components of retinal fundus images, and early changes in these vessels are clear indicators of the severity of eye diseases [1,2]. Such diseases can be diagnosed from abnormalities in the shape, size, and other morphological attributes of the retinal vasculature [3]. Hence, the study of the tortuosity, appearance, shape, and further morphological attributes of human retinal blood vessels can be an important diagnostic indicator of several ophthalmic and systemic diseases, including diabetic retinopathy, hypertensive retinopathy, arteriolar narrowing, arteriosclerosis, and age-related macular degeneration [4].
The association of irregularities in the retinal vasculature with cardiovascular disease has been described in several studies [5][6][7]. Systemic and ophthalmic diseases affect arterioles and venules very differently. For instance, generalized arteriolar narrowing is one of the early signs of hypertensive retinopathy. A decrease in the Arteriole to Venule Ratio (AVR) is a well-known predictor of stroke as well as other cardiovascular diseases in later life. Moreover, arteriovenous (AV) nicking is associated with long-term hypertension risk [5].
The advancement in retinal image acquisition and the availability of retinal fundus images made it possible to run large population-based screening programs to examine the early biomarkers of these diseases. Besides improving the diagnostic efficiency, the computerized retinal image analysis can help in reducing the workload of ophthalmologists. Therefore, an efficient algorithm for classification of retinal vasculature into the constituent venules and arterioles is an essential part of the automated diagnostic retinal image analysis system.
The arterioles and venules in retinal images look very similar to each other, with only a few known discriminating features [5]. The venules appear slightly wider than the arterioles, particularly close to the optic disc. The arterioles exhibit a clearer and wider central light reflex than the venules. The venules appear slightly darker in color than the arterioles. Moreover, arterioles generally do not cross other arterioles, and venules do not cross other venules, within the retinal vasculature tree. The intra- and interimage variability in color, contrast, and illumination adds to the existing challenges in developing an automated AV classification system. The width as well as the color of retinal vessels changes along their length as they originate from the optic disc and spread across the retina. The color change is due to variability in the oxygenation level.
Deep learning [8] has gained importance in the last few years due to its ability to efficiently solve complex nonlinear classification problems. The main advantage of deep learning is automated feature learning from raw data. Convolutional neural network (CNN) [9] architectures have been utilized for a variety of image classification and detection tasks with human-level performance. CNNs were used to identify diabetic retinopathy in retinal images in a recent Kaggle competition, with very encouraging results. The promising results of CNN-based architectures in retinal image analysis have motivated researchers to investigate the application of deep learning for pixel-level classification and labelling.
Semantic segmentation of an image [10] refers to the recognition of the image at the pixel level, such that every pixel in the image is assigned to a given class. For the artery/vein classification problem, the target is to classify each pixel in the fundus image into one of three classes: artery, vein, or background.
In this paper, the authors improve on the work in [11] by proposing an optimized deep CNN based design for classifying retinal image pixels into arterioles and venules. The proposed technique performs end-to-end training and classification of the vessels directly, without the need for vessel segmentation or for outlining the vessel centerlines [12] as proposed in different methods [13][14][15][16]. The optimization is achieved by (1) adding a segment-wise contextual judgment to the loss function that optimizes the vessel classification, (2) applying the optimized deep learning method to our newly prepared AV classification dataset, which is to be made publicly available after the necessary paperwork, and (3) improving the accuracy of the labels annotated as thin vessels. To the best of our knowledge, deep learning-based pixel-level semantic segmentation has been utilized for the first time in classifying retinal blood vessels into arterioles/venules. The proposed AV classification algorithm has the potential to replace the AV classification module in the QUARTZ retinal image analysis software tool [17], developed by our research group for quantification of retinal vessel morphology. QUARTZ aims to enable epidemiologists to analyse the association of retinal vessel morphometric properties [18] with the prognosis of various systemic/ophthalmic disease biomarkers [6].
This paper is arranged as follows: a brief review of the related techniques is given in Section 2. Sections 3 and 4 provide a comprehensive description of the proposed methodology and the new labelled dataset. In Section 5, the experimental outcomes are presented, followed by the discussion and conclusion in Sections 6 and 7.

Related Work
A number of methods are available in the existing literature for classifying the retinal vasculature into arteries and veins. These methodologies might be classified into two major approaches: the feature-based and the graph-based approaches [19].
The feature-based approaches generate, for each pixel, a set of features that is used as input to an AV classification method. The initial step in most of these methods is segmentation of the vasculature, followed by skeletonization of the vessels. The next step is the identification of crossovers and bifurcations. The entire vasculature is then partitioned into vessel segments by excluding the pixels at the bifurcation/crossover points from the skeleton images. Features are extracted from the segments, which are then classified as arteries or veins by a specific classifier.
The graph-based methods usually model the vasculature tree as a planar graph. The contextual properties of the graph components are used to formalize the local decision for each pixel as to whether it is arteriole or venule. Li [20] presented a Gaussian filter design to recognize the vessel's central reflex and utilized the least Mahalanobis distance, but the accuracy of the classification is calculated at the arteriole/venule level instead of the pixel level.
Grisan [21] proposed dividing the retinal fundus image into four quadrants, assuming that each quadrant contains at least one artery/vein, and subsequently applied a fuzzy clustering method. Saez [22] and Vazquez [23] enhanced the quadrant-based methodology, computed pixel-wise features from the HSL and RGB color spaces, and used the K-means method for AV classification. Kondermann [24] proposed background normalization, then calculated features for each vessel centerline pixel within a neighbourhood of 40 pixels, and employed a neural network for AV classification. Niemeijer et al. [25] computed a 27-dimensional feature vector for every pixel and classified the vasculature sections using a linear discriminant method. Fraz [26] presented a feature vector at various levels (segment, pixel, and profile based) and used an ensemble method for classifying the pixels. Relan [27] computed the feature vector from a circular neighbourhood of a particular radius around each pixel and classified the pixels using a least squares SVM. Xu [3] constructed a novel feature set from first- and second-order texture derivatives and passed it to a KNN classifier. Rothaus et al. [2] and Dashtbozorg et al. [28] constructed a planar graph from the vessel skeletons such that the crossovers and branches in the vascular tree are mapped to graph nodes while the graph links represent the vessel segments. Rothaus et al. generated the vessel graph, manually initialized some vessel segments, and propagated the vessel labels across the graph using a rule-based method. Dashtbozorg et al. [28] merged the graph-based methodology with a supervised pixel classification method to achieve classification at the pixel level. A 30-dimensional vector of color information-based features is computed for each centerline pixel, and a linear discriminant classifier is then applied. These classification results are combined with graph labelling to improve the results. A global likelihood method is used by Estrada [29] to assign the links to their respective a/v labels.
The feature-based and graph-based methods may suffer in situations where the vasculature tree is not segmented accurately. Moreover, these methods rely heavily on manually designed features. Welikala et al. [30] were the first to use deep learning for the problem of artery/vein classification. A six-layer convolutional neural network is used for learning the features in the vasculature tree. The approach achieved significant results in terms of performance; however, it also relies on accurate vessel segmentation of the retinal image. Sufian and Fraz [11] proposed an end-to-end pixel-level AV classification technique based on an encoder-decoder fully convolutional neural network. This technique does not rely on the segmented vasculature; rather, it learns and classifies the pixels directly from the image. However, the method was applied to a private dataset of only 100 images.
Normally the researchers go for selecting publicly available datasets or use their prepared ones that are suitable for their research purpose, whether it is vessel's segmentation, AV classification, or the analysis of the vessel's morphology. Public databases are very critical tools for the research community; they provide the needed data to develop and test new methods and enable comparisons with other approaches [31,32].
Al Diri has published a set of 40 AV classification labels [35] as an extension to the DRIVE dataset. However, this dataset is relatively small and, when used for deep learning approaches, may lead to inconsistent or contradictory comparisons and subjective results among the approaches. The scarcity of labelled gold standards leaves researchers no choice other than to classify a region of interest (ROI) in the retina, as in [24,25,27,42], or to benchmark their results using only selective data (distinguishable arteries and veins), as in [24]; the same limitation applies in [43][44][45] for segmenting the exudates in the retinal background and in [46][47][48] for localising and segmenting the optic disk.
The lack of a large labelled dataset of retinal fundus images has created an immediate need for a benchmarking reference for the research community in AV classification, vessel segmentation, and related fields, especially for deep learning solutions, which need large amounts of data to learn more accurate models.
In this work, we have improved the deep learning network structure by introducing the contextual segment level judgment to the loss function, in addition to proposing a new labelled dataset composed of 700 labelled images for AV classification and vessel segmentation. The newly introduced large dataset is suitable for becoming a benchmark reference for AV classification and vessel segmentation purposes especially with the rise and success of the deep learning methods.

The Methodology
In this work, a multiloss function optimized deep encoder-decoder was designed, based on a fully convolutional neural network design, for semantic segmentation of retinal vessels that simultaneously classifies them into arteries and veins. The proposed optimized deep learning architecture takes its motivation from SegNet [49] and achieves semantic classification of vessels into arteries and veins by assigning each pixel of the retinal image a class label (arteriole, venule, or background), without performing vessel segmentation separately, which has normally been a prerequisite step in conventional computer vision approaches to artery/vein classification.
The deep learning result that the authors have presented in [11] is optimized and improved in this work, by adding the segment-wise contextual judgment to the loss function that is targeting better optimization of vessel classification. The AV classification dataset is introduced as a large dataset that contains gold standard labels of AV classification and vessel segmentation. The dataset is built according to a systematic process as an extension and at the top of MESSIDOR and EPIC Norfolk. The optimized deep CNN is trained and evaluated on a large newly prepared AV classification dataset of 700 retinal images that helped to optimize the learned model and its results.

The Deep Network
Architecture.
The architecture consists of a sequence of encoder-decoder pairs, which are used to create feature maps, followed by pixel-wise classification.
The encoder-decoder architecture is shown in Figures 1(b) and 1(c). The complete network comprises three layers of encoder-decoder blocks, as shown in Figure 2. The input encoder block and output decoder block are presented in Figures 2(a), 2(b), and 2(c), respectively; Figures 1(d) and 2(d) illustrate the new enhancement that redirects the pixel-wise result to be adjusted by the segment-wise majority vote judgment that optimizes the loss function. The network consists only of convolutional layers, without the fully connected layers that are typically placed at the end of a traditional CNN. The encoder-decoder based fully convolutional neural network accepts an input of arbitrary size and produces an output of corresponding size. Feature learning and inference are performed on a whole-image-at-a-time basis by dense feedforward computation and back-propagation.
The encoder part of the network takes an input image and generates a high-dimensional feature vector by learning features at multiple levels of abstraction and aggregating them across levels. The decoder part of the network takes the high-dimensional feature vector and generates a semantic segmentation mask. The building blocks of the network are convolutional, downsampling, and upsampling layers. Learning is performed within the subsampling layers using strided convolutions and max pooling. The upsampling layers in the network enable pixel-wise prediction by applying unpooling and deconvolutions.
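To make the pooling and unpooling pair concrete, the following is a minimal pure-Python sketch of 2x2 max pooling that remembers the position of each maximum, followed by unpooling that writes each value back to its remembered position, in the style of SegNet [49]. It is an illustration only: the function names, the 2x2 window, and the assumption of even image dimensions are ours, and whether the proposed network stores pooling indices in exactly this way is an assumption.

```python
def max_pool_2x2(fmap):
    """2x2 max pooling with stride 2 (assumes even height and width).

    Returns the pooled map and, for each pooled value, the (row, col)
    position it came from in the input feature map.
    """
    pooled, indices = [], []
    for r in range(0, len(fmap), 2):
        prow, irow = [], []
        for c in range(0, len(fmap[0]), 2):
            window = [(fmap[r + dr][c + dc], (r + dr, c + dc))
                      for dr in (0, 1) for dc in (0, 1)]
            value, pos = max(window, key=lambda t: t[0])
            prow.append(value)
            irow.append(pos)
        pooled.append(prow)
        indices.append(irow)
    return pooled, indices


def unpool_2x2(pooled, indices, height, width):
    """Place each pooled value back at its remembered position; zeros elsewhere."""
    out = [[0] * width for _ in range(height)]
    for prow, irow in zip(pooled, indices):
        for value, (r, c) in zip(prow, irow):
            out[r][c] = value
    return out


fmap = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [6, 2, 3, 1],
]
pooled, idx = max_pool_2x2(fmap)
restored = unpool_2x2(pooled, idx, 4, 4)
```

In a decoder, the sparse unpooled map would then be densified by convolution (or deconvolution) layers, which is the part this sketch leaves out.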

The Multiloss Function.
The pixel-wise loss is the traditional way to compare the results of the segmented image with the ground truth label. It is achieved by comparing the pixel from the segmented image to the same pixel in the ground truth and building the confusion matrix and performing the calculation of desired performance measures.
In the arterioles and venules (AV) classification problem, the pixel-wise loss calculation is problematic because of the severely inconsistent thickness of the resulting thin vessels; thin vessels in the labelled image are defined as vessels less than four pixels wide. The impact of this issue on the results is high, since the thin arteries and veins represent 77% of the retinal vessels. The inconsistent thickness of these thin arteries and veins skews the results and causes some pixels that belong to arteries to be wrongly judged as vein pixels and vice versa.
To overcome this problem, we have improved our published deep learning approach for AV classification [11] by adding a segment-wise loss to the model, inspired by the skeletal method in [50,51]; this increases accuracy and reduces the bias in sensitivity and specificity. New steps are added to optimize the AV classification results. Instead of using a pixel-to-pixel similarity measure, each skeleton segment in the reference skeleton map is adaptively assigned the color that achieves the optimal loss calculation. The result is determined by selecting either the class color generated by the SoftMax function or the color of the class attained by the majority of the pixels in the skeleton segment. The pixels that lie in the same targeted segment as the skeleton map receive the color judgment of the segment's skeleton map color. This enhanced loss calculation hence improves the results: it fixes the few artery segment pixels that are wrongly judged as vein pixels and, conversely, the few vein pixels that are judged as artery pixels by the deep learning model.
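The majority-vote component of this judgment can be illustrated with a short pure-Python sketch. It assumes a hypothetical input in which every vessel pixel already carries a segment identifier (0 for non-vessel pixels) and relabels each segment with its most frequent pixel-wise class. It shows only the vote as a relabelling step, not the adaptive choice between the SoftMax class and the segment class, nor the loss itself; all names and class indices are illustrative.

```python
from collections import Counter

BACKGROUND, ARTERY, VEIN = 0, 1, 2  # class indices chosen for illustration


def majority_vote(pred, segment_ids):
    """Relabel every vessel segment with its most frequent predicted class.

    pred[r][c]        : pixel-wise predicted class
    segment_ids[r][c] : id of the vessel segment the pixel belongs to,
                        or 0 if the pixel is not part of any segment
    """
    votes = {}
    for prow, srow in zip(pred, segment_ids):
        for cls, seg in zip(prow, srow):
            if seg:
                votes.setdefault(seg, Counter())[cls] += 1
    winner = {seg: count.most_common(1)[0][0] for seg, count in votes.items()}
    return [[winner[seg] if seg else cls for cls, seg in zip(prow, srow)]
            for prow, srow in zip(pred, segment_ids)]


# Segment 1 is mostly artery with one stray vein pixel; segment 2 the reverse.
pred = [
    [BACKGROUND, ARTERY, ARTERY, VEIN, ARTERY],
    [BACKGROUND, VEIN, VEIN, ARTERY, VEIN],
]
segments = [
    [0, 1, 1, 1, 1],
    [0, 2, 2, 2, 2],
]
fixed = majority_vote(pred, segments)
```

After the vote, the stray pixel in each segment takes the class of its segment, while background pixels keep their pixel-wise label.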

Learning Details.
The methodology is evaluated on the AV classification dataset that contains 700 images detailed in Section 4, such that 90% of the images are used for training, and 10% of the images are used for testing.
The available pretrained models, which include AlexNet, VGG, and ResNet, are trained on PASCAL VOC [52] or [53]. These datasets are very different from retinal images. Therefore, the pretrained weights are not used. Stochastic gradient descent (SGD) is used to train the entire network. The learning rate is fixed at 0.1, and a mini-batch of 12 images is used for training. The approach adds an extra step that calculates the multiloss function judgment for each pixel and takes the optimal loss value, as shown in Figure 1 and (1)-(8). The newly prepared AV classification dataset is used to measure and benchmark the performance of the deep learning encoder/decoder, which is then tested on the DRIVE and AVRDB [54] public datasets.
The challenge is to achieve the segmentation of the three classes (arteries, veins, and background). The resolution of the RGB images is 2000×2002. Local contrast normalization [55] is applied to the original images. Following the SegNet semantic segmentation in [49], median class frequency balancing [49] is used in training, and the weights of the decoder are initialized using the "MSRA" weight initialization technique. Each deep learning layer is followed by a ReLU nonlinear unit. The learning rate is set to 0.1, and stochastic gradient descent (SGD) is used to train all the variants, where each epoch starts after shuffling the training set. Each mini-batch (4 images) is then fetched in order, ensuring that each image is used only once per epoch. The objective function used for training is the cross-entropy loss [56]. The loss is summed over all the pixels in a mini-batch. Training continues until convergence, when the optimal training loss is reached. The model with the highest performance on the validation dataset is selected.
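As an illustration of how class balancing interacts with the summed cross-entropy, the sketch below computes median frequency weights and a class-weighted cross-entropy in pure Python. It uses the common formulation in which the frequency of a class is its pixel count divided by the total pixel count of the images that contain it, and the weight is the median frequency divided by the class frequency. The function names, the toy label image, and the assumption that every class appears at least once are ours, not details from the paper.

```python
import math
from statistics import median


def median_frequency_weights(label_images, num_classes=3):
    """Weight per class = median(freq) / freq[class].

    Assumes every class occurs in at least one label image.
    """
    pixel_count = [0] * num_classes   # pixels of class c
    image_pixels = [0] * num_classes  # total pixels of images containing c
    for img in label_images:
        flat = [v for row in img for v in row]
        for c in range(num_classes):
            n = flat.count(c)
            if n:
                pixel_count[c] += n
                image_pixels[c] += len(flat)
    freq = [pixel_count[c] / image_pixels[c] for c in range(num_classes)]
    med = median(freq)
    return [med / f for f in freq]


def weighted_cross_entropy(probs, labels, weights):
    """Sum over all pixels of -w[y] * log(p[y]), with y the true class."""
    total = 0.0
    for prow, lrow in zip(probs, labels):
        for p, y in zip(prow, lrow):
            total += -weights[y] * math.log(p[y])
    return total


# Toy label image: 0 = background, 1 = artery, 2 = vein.
labels = [[[0, 0, 0, 1, 1],
           [0, 0, 0, 2, 2]]]
weights = median_frequency_weights(labels)

# One artery pixel predicted with probabilities (background, artery, vein).
loss = weighted_cross_entropy([[(0.5, 0.25, 0.25)]], [[1]], weights)
```

The dominant background class receives a weight below one, so errors on the much rarer vessel pixels contribute relatively more to the summed loss.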
To calculate the segment-level loss, we start with the binary image and generate the vessel vasculature tree using [54]; we then perform skeletonization using morphological processing to generate the entire vessel skeleton J_s; finally, we identify the bifurcation and intersection points, which are omitted to generate the set of vessel segments S = {S_j : j = 1, ..., N}, where N is the number of segments in the retinal image. A hyperparameter Maxlength represents the maximum length of a segment, which is defined as the summation of all deviations of all the segment pixels from the maximum diameter (Tp).
V_j = the vessel pixels in the segment containing the skeleton segment S_j.
Figure 3: Creation methodology for the AV classification labels and the vessel segmentation labels. Step 1, manual labelling of arteries and veins. Step 2, annotating the labels: segregation of arteries and veins into two different images using the VAMPIRE annotation tool or the image labeller tool. Step 3, verification of the generated labels: comparing the annotated labels with the original image to identify and fix discrepancies. Step 4, ground truth generation from the annotation. Step 5, expert validation: the labels are validated by two ophthalmologists.
The same is done for the predicted image, where we equivalently obtain J', S', V', and T'. To optimize the vessel color, every vessel segment V_j in J_s is allocated a search range R_j, defined as the minimum radius that guarantees the maximum overlap between V_j and V'_j, in order to find the corresponding pixel colors in the predicted image on which to perform the judgment. By manipulating the predicted image during training, we produce a binary map by applying a threshold of 0.5 (specifically, every pixel in the predicted image is classified as artery or vein by checking its probability value against the threshold 0.5). Then, for every vessel segment V_j in J, the colors of all the pixels in the predicted image located within the search range R_j form a vessel segment denoted Vs_j. The thickness inconsistency of the vessel, or the pixel color inconsistency between Vs_j and V_j, is measured by the mismatch ratio (MisR). To measure the segment-level loss, we construct a weight matrix in which each pixel p is assigned a weight w_p; this weight is used to calculate the loss of predicting the color of the pixel P_predicted in the predicted image compared with the same pixel P_label in the original label, giving a weighted loss. This is compared with the probability value of the pixel-wise loss to calculate the segment loss, which builds the segment-level loss judgment and adaptively builds the matrix of artery and vein loss probabilities. P_A represents the artery-level loss while P_V represents the vein-level loss; both contribute to the multiloss function judgment for the vessel pixel, as in (8). This leads to assigning the pixel class color based on the segment-level loss if the pixel-wise loss shows prediction deviation in some of the segment pixels.
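The first stage of this procedure, turning a skeleton into vessel segments by omitting the bifurcation and intersection points, can be sketched in pure Python as follows. The sketch assumes the skeleton is given as a set of (row, col) pixels, treats any skeleton pixel with three or more 8-connected skeleton neighbours as a junction, and collects the remaining pixels into connected components. The junction rule and all names are illustrative assumptions; the Maxlength constraint and the search range R_j are not modelled.

```python
NEIGHBOURS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
              if (dr, dc) != (0, 0)]


def split_into_segments(skeleton):
    """Split a skeleton (set of (row, col) pixels) into junction-free segments."""
    def neighbours(p):
        return [(p[0] + dr, p[1] + dc) for dr, dc in NEIGHBOURS
                if (p[0] + dr, p[1] + dc) in skeleton]

    # Junction pixels: three or more skeleton neighbours (8-connectivity).
    junctions = {p for p in skeleton if len(neighbours(p)) >= 3}
    remaining = skeleton - junctions

    segments, seen = [], set()
    for start in sorted(remaining):
        if start in seen:
            continue
        # Depth-first flood fill over the remaining skeleton pixels.
        stack, segment = [start], set()
        seen.add(start)
        while stack:
            p = stack.pop()
            segment.add(p)
            for q in neighbours(p):
                if q in remaining and q not in seen:
                    seen.add(q)
                    stack.append(q)
        segments.append(segment)
    return segments


# A T-shaped skeleton: a horizontal vessel with one vertical branch.
skeleton = {(3, c) for c in range(7)} | {(r, 3) for r in range(3)}
segments = split_into_segments(skeleton)
```

With 8-connectivity the pixels directly beside a junction also have three neighbours, so a small cluster around each bifurcation is removed rather than a single pixel; a production implementation would likely refine this rule.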

AV Classification Dataset Preparation
A plan was prepared to generate a new dataset of 700 AV classification labels for retinal fundus images for supervised AV classification problems, and another set of 700 AV vessel segmentation labels for the vessel segmentation problem. The methodology follows the specified process (Figure 3) and the morphological features of the vascular system of the retina [57]. The rules below were used throughout the systematic manual labelling process.

Vessel Labelling
(1) The arteries are brighter than the veins, usually thinner than neighbouring veins, and have their central reflex usually brighter/wider than veins [24].
(2) When a vessel divides into two branch segments, the width of the parent vessel is greater than the width of any of its branches, and the angle between the parent vessel and the branch's segment is not greater than 90 degrees [58].
(4) The vessels, crossing each other, must be from the opposite class.

Labelling
Methodology.
For vessel labelling, we examined the original color images, the green channel of the images, and the grey-scaled preprocessed versions of the images. The vessel-type labelled set is created according to the process shown in Figure 3. The vessel labelling is performed as a systematic process, which starts with manual marking according to a set of physiological characteristics of the retina. Annotation is then performed with the image labeller on a preprocessed version of the original images by two computer vision specialists and an ophthalmologist. The validation process includes counter validation by specialists in computer vision and review and verification of the vessel class by an ophthalmologist. In the first stage, we manually marked the vessels on a preprocessed image. The aim of the preprocessing is to increase the discrimination between arteries, veins, and the background color in order to obtain an enhanced version of the image. The preprocessing applies morphological processing, normalization, and intensity averaging to each RGB layer and then maps the grey-scale intensity values of each layer to new values by saturating the bottom 2% and the top 3% of all pixel values. This operation increases the contrast of the preprocessed output image. The vessel labelling methodology is completed in the following steps:
(1) The line operator filter method is used to obtain a vessel-enhanced version of the image. The preprocessed image is used for marking each artery as "a" and each vein as "v".
(2) The vessels are manually labelled by an expert using the known distinguishing features of arteries and veins explained in Section 4.1; the vessels are then annotated by two experts using the image labeller application available with MATLAB as well as the VAMPIRE vessel segmentation tool [59].
(3) Validation activity is then performed by observers who collaborate during the labelling process to review and finalize the annotation type of entire segments of the vessel. The label is finally generated from the reviewed annotations.
(4) The next step in the labels preparation is the generation of the AV classification labels as well as the vessel segmentation labels from the original images and the annotations.
Two ophthalmologists at St. George's, University of London, UK, verify all the labels. The validation step ensures complete agreement between the reviewers and annotators: if there is a comment, the reviewers report it to the annotators, who fix and resubmit the labels until complete agreement is achieved. This offers a reliable AV classification labels dataset for research scholars in the field.
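The intensity mapping used in the preprocessing, which saturates the bottom 2% and the top 3% of pixel values, can be sketched for a single channel as below. This is a minimal illustration under our own assumptions: a simple nearest-rank percentile, a linear map to the range 0 to 255, and a flat list of intensities as input. It does not reproduce the morphological processing, normalization, or intensity averaging steps.

```python
def stretch_channel(values, low_frac=0.02, high_frac=0.03, out_max=255):
    """Linearly stretch one channel, saturating the darkest and brightest pixels.

    values    : flat list of intensities for one channel
    low_frac  : fraction of darkest pixels mapped to 0
    high_frac : fraction of brightest pixels mapped to out_max
    """
    ordered = sorted(values)
    n = len(ordered)
    lo = ordered[int(low_frac * (n - 1))]         # nearest-rank lower limit
    hi = ordered[int((1 - high_frac) * (n - 1))]  # nearest-rank upper limit
    if hi == lo:
        return list(values)  # flat channel: nothing to stretch
    scale = out_max / (hi - lo)
    return [min(out_max, max(0, round((v - lo) * scale))) for v in values]


channel = list(range(50, 150))  # 100 synthetic intensities in a narrow band
stretched = stretch_channel(channel)
```

The narrow input band of 50 to 149 is spread over the full output range, which is the contrast increase described above.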

Labels Enhancement.
The manual labelling and annotation of the vessels are carried out by observing the continuity of the vessel type throughout the vessel branching, referring to the physiological characteristics of retinal vessels and to the observed attributes of the vessels, following the labelling method detailed in the previous section. After completing the manual preparation of the labels in Figure 3, we applied the following steps to enhance the prepared labels:
(1) Extracting the vessel tree of the original images as a black and white image [60]
(2) Combining the vessel tree results above with the manually prepared labels to enhance the label edges
Panel (e) visualizes the vessel thickness inconsistencies between the vessel segmentation and the manual annotation that the enhancement addresses to improve the labels. Some images suffer from poor contrast, poor illumination, or pathologies; this affects the manual annotation process and hence the annotated vessel widths when compared with the normally segmented widths.

Materials.
The methodology is evaluated on the new AV classification dataset, which is prepared to be suitable for deep learning, especially for artery/vein classification and vessel segmentation, as it contains labels for both types of problems. The dataset is created by preparing three new types of labels for each image; the first type is for artery/vein classification, while the second and third are for vessel segmentation. The original images are a subset of images from the EPIC and MESSIDOR datasets. Seven hundred labels are created for each type: two hundred for images from the EPIC dataset and five hundred for images from the MESSIDOR dataset. The colored AV classification labels from the created AV classification dataset are used to implement the optimized deep learning method.

The AV Classification Dataset.
The labels creation methodology is implemented to generate three types of labels (colored AV classification labels, colored vessel segmentation labels, and black and white vessel segmentation labels).
The images and their labels are 2002 × 2000 pixels in size. The first experiment on the dataset, the AV classification deep learning problem, is handled in this paper. The high-level dataset hierarchy is illustrated in Figure 6; it contains two subfolders, one for the 200 EPIC images and the other for the 500 MESSIDOR images, and each image appears in the subfolders for the original images, the AV classification labels, and the vessel segmentation labels (the latter containing one subfolder for the colored labels and another for the black and white labels).
The AV classification colored label (Figure 7(a)) consists of arteries (red), veins (blue), and background (yellow). The vessel segmentation labels come in two formats. The first format (Figure 7(b)) is a standard black and white binary image. The second format (Figure 7(c)) is a colored image where the vessels are in blue and the background is in yellow.
For each image, two types of labels are prepared: one for AV classification (see Figure 8 for the prepared MESSIDOR and EPIC AV classification labels) and the other for vessel segmentation, with an image resolution of 2002×2000.
The DRIVE [31] dataset is a consistent set of fundus images used for benchmarking classification and vessel segmentation performance. The images were taken in the Netherlands as part of a diabetic retinopathy screening programme. The data were collected from a total of 400 diabetic subjects within the age group of 25-90 years. All the images are JPEG compressed. A Canon CR5 3CCD nonmydriatic camera was used to capture the images. The resolution of each image is 768 by 584 pixels, with 8 bits per pixel. The training and test sets each contain 20 images. The dataset includes manually prepared ground truth images and masks made by two specialists.
AVRDB [54] was developed as a publicly available dataset for hypertensive retinopathy detection. It contains 100 images captured with a TOPCON TRC-NW8. The annotated labels of the dataset were prepared in coordination with expert ophthalmologists from AFIO, Pakistan. The vasculature tree is classified into venular and arteriolar. The 100 images are 1504×1000 in size, and the dataset contains retinal veins, arteries, and AVR, with the vascular, arteriole, and venule branching and a mapping of the artery and vein network on the original fundus image.
The metrics in Table 1 are used to quantify the performance of the proposed method.

Optimized Multiloss Function Judgment.
The proposed improved segment-wise judgment has enhanced the AV classification pixel-wise results. The pixel-wise judgment is adjusted by comparing it with the contextual color of the adjacent pixels in the segment (see Table 2).

Metric | Description
Mean Accuracy | The ratio of correctly classified pixels, averaged over the classes

Table 4 shows that the performance is higher when the multiloss judgment step is applied in the FCN method. The proposed method achieved 97.03% accuracy on the AV classification dataset, and 96.07% and 98.13% on the DRIVE and AVRDB datasets, respectively.
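The mean accuracy metric defined above can be read as a per-class pixel accuracy averaged over the classes. The following sketch shows one such reading; the function name, the class indices (0 background, 1 arteriole, 2 venule), and the choice to skip classes absent from the ground truth are assumptions, since the paper does not spell out these details.

```python
def mean_accuracy(truth, pred, num_classes=3):
    """Average per-class pixel accuracy over classes present in the ground truth."""
    correct = [0] * num_classes
    total = [0] * num_classes
    for t_row, p_row in zip(truth, pred):
        for t, p in zip(t_row, p_row):
            total[t] += 1
            correct[t] += int(t == p)
    # Classes with no ground-truth pixels are skipped to avoid division by zero.
    per_class = [c / n for c, n in zip(correct, total) if n > 0]
    return sum(per_class) / len(per_class)


truth = [[0, 0, 1, 1], [0, 0, 2, 2]]
pred = [[0, 0, 1, 2], [0, 0, 2, 2]]
score = mean_accuracy(truth, pred)  # (1.0 + 0.5 + 1.0) / 3
```

Averaging per class keeps the dominant background class from masking errors on the much rarer vessel pixels; in this example the overall pixel accuracy would be 7/8, while the class-averaged value is lower because one of only two arteriole pixels is wrong.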

Visualization of Results.
We ran the experiments twice on the 700 prepared images: first with the FCN alone, and then with the added segment-wise loss calculation. This approach improved the results, as shown in Table 3. Table 4 and the output images in Figures 9-11 show the performance before and after adding the segment-wise contextual judgment to the AV classification results. Figure 9 shows the classification outcomes of the proposed methodology on the AV classification dataset. The first column shows the retinal image, the second the ground truth, the third the AV classification result produced by pixel-wise classification, and the fourth the output after segment-wise optimization of the pixel-wise result.
The sample images are from the newly prepared AV classification dataset. In the label colors, the background is yellow, and the arterioles and venules are red and blue, respectively. Figures 10 and 11 show the AV classification results on the DRIVE and AVRDB datasets. The first column shows the retinal image, the second the ground truth, and the third the semantic segmentation label. The fourth column shows the AV classification result produced by pixel-wise classification, and the fifth the output after segment-wise optimization of the pixel-wise result. The label colors are the same as in Figure 9. Figures 9-11 show the benefit of applying the multiloss function. Figure 12 gives an example of how the segment-wise loss judgment improved the segmentation results.

Comparison with Other Methods.
The performance of the proposed approach is compared with previously established AV classification methods. The proposed optimized method achieved better results in deep-learning-based classification of arteries and veins in retinal images. The comparison with previously published algorithms is shown in Table 4. Table 4 makes clear that several researchers used private datasets, while the rest used the DRIVE dataset, which contains only 40 images. Moreover, most of the introduced AV classification methods used a pixel-to-pixel similarity check to measure performance and to compare with previous work; this metric is affected by observer-to-observer variations in vessel location and thickness, which complicates benchmarking. In summary, (1) the methods differ in the dataset used, (2) the methods of measuring performance are not the same, (3) the datasets used are relatively small, and (4) their sizes are insufficient for deep learning approaches in particular. Hence we propose using a new large dataset and applying the segment-wise judgment to enable a standard comparison between methods and to improve the reliability, robustness, and credibility of the proposed solutions. The proposed method achieved 97.03% accuracy on the AV classification dataset and 95.5% and 98.5% accuracy on the DRIVE and AVRDB datasets, respectively.

Discussion
Most of the introduced AV classification methods use a pixel-to-pixel similarity check to measure performance against previous work in the field. This evaluation suffers from observer-to-observer variations in vessel location and thickness, and from the use of small or private datasets that are not suitable for deep learning approaches.
The AV classification research community lacks a large public AV classification image set for assessing the reliability and robustness of a proposed method. The segment-wise loss judgment improved the FCN deep learning results from 93.5% in [11] to 97.03%; in addition, it is used to enhance the new AV classification labels (Figure 8). The newly introduced AV classification dataset contains standard labels created on top of the MESSIDOR and EPIC datasets, and deep learning on this dataset reached 97% accuracy for AV classification, which compares favorably with previous works. This is achieved by applying deep learning to a larger dataset and enhancing the result through the segment-wise loss judgment. Therefore, the results are more credible. The vessel thickness inconsistency in the AV classification dataset labels is penalized using the contextual segment-level loss judgment (see Section 4.3: Labels Enhancement).
The AV classification dataset used in this work will be released publicly for the MESSIDOR images. The research community can use it in experiments on AV classification or vessel segmentation. The AV classification and vessel segmentation labels are available in black-and-white or semantic colored formats.

Conclusion
The medical imaging research field lacks a large labelled retinal image dataset that can serve as a central reference for assessing an AV classification method and benchmarking it against other researchers' methods.
In this paper, improved results are presented for a novel encoder-decoder based fully convolutional deep network for robust AV classification of retinal blood vessels. An optimized segment-wise judgment technique for improving the AV classification results is proposed. The use of the multiloss function with segment-level contextual judgment improved the AV classification accuracy from 93.5% to 97% in this work. A new AV classification dataset is prepared as an extension to the MESSIDOR and EPIC Norfolk datasets. It contains 700 fundus images labelled for both AV classification and vessel segmentation.
This new labelled retinal image database for AV classification and vessel segmentation, together with the proposed multiloss function judgment technique, allows more accurate evaluation and benchmarking against previous methods in terms of reliability, robustness, and performance. The three types of labels in the AV classification dataset for the MESSIDOR images will be released publicly so that the research community can use them in experiments. The AV classification database will be available at http://vision.seecs.edu.pk/dlav/ or by emailing the authors; we welcome feedback from users on their experience with the dataset, which will help keep it updated.
In the future, we aim to extend this methodology so that it can replace the current AV classification module in the QUARTZ software, which is developed by the research group for automated quantification of retinal vessel morphometry, with the aim of studying associations between vessel change and systemic/ophthalmic disease prognosis. Furthermore, we aim to use the proposed methodology as a preliminary step in the development of QUARTZ modules for the identification of venule beading and the measurement of arteriovenous nicking.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.