Comparing Conventional and Deep Feature Models for Classifying Fundus Photography of Hemorrhages

Diabetic retinopathy is an eye pathology that creates retinal abnormalities and causes visual impairment; proper treatment requires identifying these irregularities. This research uses a hemorrhage detection method and compares classification with conventional and deep features. In particular, the method identifies hemorrhages that are connected to blood vessels or reside at the retinal border, which have been reported as challenging to detect. Initially, adaptive brightness adjustment and contrast enhancement rectify degraded images. Prospective locations of hemorrhages are estimated by a Gaussian matched filter, entropy thresholding, and a morphological operation. Hemorrhages are segmented by a novel technique based on the regional variance of intensities. Features are then extracted by conventional methods and deep models for training support vector machines, and the results are evaluated. Evaluation metrics for each model are promising, but the findings suggest that deep features are more effective than conventional features.


Introduction
Diabetic retinopathy (DR) is a prevalent cause of vision loss among working-age adults. The number of DR patients has been projected to reach 191 million by the year 2030 [1]. In its initial stage, diagnosis is almost impossible because of the absence of distinct symptoms. Identifying DR in the early phase is crucial because timely treatment and medication may reduce the progression rate by approximately 57% [2]. Therefore, an annual examination is recommended for diabetes patients. Several surveys have highlighted that diabetes patients decline regular checkups because of the lack of symptoms, the time-consuming diagnostic process, and limited access to ophthalmologists [3]. DR falls into two main categories: nonproliferative diabetic retinopathy (NPDR) and proliferative diabetic retinopathy (PDR). NPDR weakens capillary walls and causes blood to leak from vessels, forming microaneurysms (MAs). Later, ruptures turn MAs into hemorrhages (HEs). MAs and HEs are often termed red lesions. As the disease progresses, NPDR turns into PDR, and angiogenic factors give rise to new blood vessels, a process called neovascularization.
Eye experts use fluorescein angiography (FA), optical coherence tomography (OCT), and fundus photography for DR screening [4]. FA is used to identify locations where blood vessels are closed or ruptured. OCT provides a cross-sectional view that determines the amount of fluid in retinal tissue and is used to evaluate the effectiveness of the adopted treatment. Fundus photography is an easy and immediate screening technique for documenting DR progression and its improvement over time. An ophthalmologist may recommend laser treatment, eye injections, or eye surgery when DR threatens eyesight [5]. Laser treatment helps to treat the neovascularization of blood vessels at the back of the eye and stabilizes the changes caused by diabetes. Eye injections are used in the case of PDR to stop the emergence of new blood vessels. The benefit of this method is an improvement in eyesight. However, steroid injections produce excessive pressure inside the eye that may cause blood clots. Eye surgery is performed when a massive amount of blood accumulates in the vitreous humour. The eye specialist removes some of the jelly-like substance that fills the space at the back of the eye.
Retinal fundus imaging is preferred for the initial screening phase because it is easy to assess and less expensive. Ophthalmologists capture retinal images using a fundus camera with an appropriate field of view (FOV). Early signs of DR are observed to determine its stage for medical prescription. Despite these benefits, HE detection is challenging because of several impediments. Factors such as blurriness and poor illumination may reduce diagnostic accuracy. Uneven lighting may produce dark shades in retinal images, which mislead detection. Blood vessels share intensity characteristics with HEs because of their similar appearance. Sometimes, HEs adjoin blood vessels because they originate from them. Detecting those HEs is imperative for early screening of DR. HEs that reside at the retinal periphery blend with the black background and are difficult for computer-aided automatic detection to identify. Appropriate selection of a deep network for classifying HEs is also crucial to obtain promising results. Together, these constraints make HE detection a challenging task. Figure 1 shows the characteristics of fundus images.
The risk of error in human interpretation necessitates an efficient algorithm that can segment and classify hemorrhages effectively. A computer-based second interpreter expedites the diagnostic process and assists ophthalmologists in assessment. The proposed methodology addresses the problems of fundus images described above. A novel gradient-based adaptive gamma correction adjusts the brightness of fundus images adaptively. Image calibration enables an automatic detection scheme. The proposed smart window-based adaptive thresholding (SWAT) segments objects while isolating hemorrhages from blood vessels and the retinal periphery. Objects are classified using conventional features selected intuitively from the visual appearance of hemorrhages in retinal fundus images. A statistical comparison of features for HE classification using conventional and deep models is provided. This study uses several deep model architectures to analyze which is suitable for HE classification. Identification and detection of hemorrhages that reside at the retinal periphery or are connected with blood vessels are the hallmarks of the proposed algorithm.

Related Work
N. Figueiredo et al. [6] proposed an algorithm to detect retinal abnormalities at the early stage of DR. This technique uses three classifiers to detect lesions, including HEs. Novel features based on the inherent properties of lesions are used for classification. These features are extracted from wavelet bands, Hessian multiscale analysis, variational segmentation, and texture decomposition. The sensitivity and specificity of HE detection are 86% and 90%, respectively. Tang et al. [7] propounded a splat feature classification method for HE detection. The retinal image is partitioned into nonintersecting segments called splats. Each splat is formed from pixels with similar color and spatial information. Shape, texture, the intersection of neighboring splats, and filter bank information are used as features. Later, optimum features are selected using the filter approach. This method achieves an area under the receiver operating characteristic (ROC) curve of 0.96. Junior and Welfer [8] proposed the detection of early signs of DR. Their technique uses mathematical morphology to remove the fovea and blood vessels because they share intensity characteristics with HEs. This approach achieves 87.69% sensitivity and 92.44% specificity. Zhou et al. [9] presented an HE detection technique based on the gradual elimination of blood vessels. This technique deals with HEs that are attached to blood vessels by segmenting the dark regions, retinal vasculature, and HE candidates. The binary image is processed further to improve vascular connectivity, and the vasculature is then removed gradually. A support vector machine (SVM) is trained using 49 features to classify candidates into non-HEs and HEs. The technique reports promising results on two datasets. Karkuzhali and Manimegalai [10] detect retinal abnormalities to classify fundus images into various DR stages. A median filter, shade correction, a Gaussian filter, and a modified Kirsch filter are used to suppress noise and enhance quality in the preprocessing stage. The image is divided into nonoverlapping patches of similar gray information. The superpixel method is applied to obtain the uneven grids. Gradient magnitude with toboggan segmentation is used for HE segmentation. A feature vector and a classifier then assign images to the various stages of DR.
Tan et al. [11] presented automatic segmentation of retinal lesions using a novel single convolutional neural network (CNN). The proposed CNN model consists of 10 layers and classifies retinal lesions simultaneously. The technique normalizes input images before network training. The model achieves a sensitivity of 0.6257 on a large dataset. Lam et al. [12] proposed another automatic detection method for retinal lesions. The technique uses 1,324 image patches to train the deep network. A sliding window considers all patches of the test image to generate a probability map. This CNN model provides promising results for each type of lesion. Islam et al. [13] propounded a deep learning approach for the detection and grading of DR. The technique focuses on early DR detection using a novel CNN. The method is tested on a publicly available Kaggle dataset and reports a quadratic weighted kappa score of 0.851 and an area under the curve of 0.844. Pal et al. [14] proposed a technique for detecting red lesions using the You Only Look Once (YOLO-V3) algorithm. The contrast of the green channel is enhanced, and the bounding boxes of red lesions are then obtained using the YOLO algorithm. Detection is performed using Darknet53, and logistic regression provides the confidence level of an object. The model is trained using Adam optimization and tested for red lesion detection. An objectness threshold is employed to reduce false predictions. This technique achieves an average precision of 83.33%. Shankar et al. [15] presented a synergy deep learning model to classify fundus images into DR stages.

Journal of Healthcare Engineering

The technique of Shankar et al. [15] removes noise from the edges in the preprocessing stage. Then, histogram-based segmentation obtains regions for classification. The synergy deep learning model classifies images into severity levels. The algorithm is benchmarked on the Messidor dataset, where it shows promising results.

Method
This section provides a detailed explanation of the detection scheme. Figure 2 shows the steps of the proposed HE detection scheme. First, the image is preprocessed to enhance its quality. Then, prospective hemorrhage candidates are estimated. The objects are segmented using smart window-based adaptive thresholding. Finally, the objects are classified into hemorrhage and nonhemorrhage classes using features.

Dataset Description.
The algorithm is trained and tested on the DIARETDB1 dataset [16]. The dataset contains 89 fundus images, of which five are normal and the rest show various pathological signs of DR. The images were captured with a 50-degree field of view using a fundus camera under different illumination conditions.

Preprocessing.
A few images in the DIARETDB1 dataset have good brightness and contrast, while the majority are dark with low contrast. The quality of the fundus images is enhanced using contrast limited adaptive histogram equalization (CLAHE) [17], gradient-based adaptive gamma correction (GAGC) [18], and nonlinear unsharp masking [19]. CLAHE enhances contrast and reduces the effects of over-saturation by clipping intensity peaks. Our GAGC utilizes Sobel gradient information: gamma correction [20] is applied using the adjusted threshold value of the Sobel operator. HEs can be attached to blood vessels and can only be separated when their regions are clearly defined. Therefore, fuzzy logic-based image sharpening with a nonlinear filter is employed. This method determines a fuzzy relationship between the central and adjacent pixels in a 3 × 3 window. Sharpening filters work efficiently, but they introduce noise into the image. The nonlinear filter sharpens images while producing less noise than linear filters. The result of the preprocessing stage is shown in Figure 3(b).
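As a rough illustration of the brightness-adjustment idea, the sketch below applies ordinary gamma correction, with the exponent chosen from the mean Sobel gradient magnitude. It is a minimal sketch, not the GAGC rule of [18]: the mapping in `choose_gamma` and its `low`, `high`, and `scale` parameters are assumptions made for illustration, and images are plain nested lists of intensities in [0, 1] so that the example has no dependencies.

```python
import math


def mean_sobel_magnitude(img):
    """Mean Sobel gradient magnitude over the interior pixels of a 2-D list."""
    h, w = len(img), len(img[0])
    total, count = 0.0, 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            total += math.hypot(gx, gy)
            count += 1
    return total / count if count else 0.0


def choose_gamma(mean_grad, low=0.5, high=1.0, scale=1.0):
    """Hypothetical rule: weak gradients (flat, dark image) give a smaller
    gamma, i.e. stronger brightening; strong gradients leave the image alone."""
    t = min(mean_grad / scale, 1.0)
    return low + (high - low) * t


def gamma_correct(img, gamma):
    """Apply out = in ** gamma to every pixel (intensities in [0, 1])."""
    return [[p ** gamma for p in row] for row in img]


dark = [[0.2] * 5 for _ in range(5)]
gamma = choose_gamma(mean_sobel_magnitude(dark))
brightened = gamma_correct(dark, gamma)
```

With a gamma below 1, every intensity in (0, 1) is raised, which is why a dark, low-gradient image is brightened under this assumed rule.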

Seed Points Extraction.
The detection process can be time-consuming if the entire image is searched. A better approach is to obtain prospective locations of the objects to be detected and eliminate redundant information, which expedites detection while retaining high accuracy. Our work follows this approach by exploiting the intensity profile of HEs. HEs are dark objects surrounded by bright regions, and they share intensity characteristics with blood vessels and dark shades. This property suggests an inverted Gaussian matched filter [21], whose values are low at the center and grow gradually away from it. The filter enhances HEs and blood vessels because of their high correlation with it and yields a low response over the rest of the image, as depicted in Figure 4(a). The redundant information is further reduced by thresholding. The matched-filtered image shows that low and high responses are close to each other. Therefore, entropy thresholding [22] is employed, which eliminates unneeded information efficiently. This thresholding method finds cross entropies between quadrants of the gray-level co-occurrence matrix (GLCM). Candidate thresholds across the gray range are evaluated successively, and the one that minimizes the objective function is selected. Figure 4(b) shows a sample image after cross-entropy thresholding.
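The behaviour of an inverted Gaussian matched filter can be sketched in one dimension. The example below builds a zero-mean inverted Gaussian cross-section and correlates it with an intensity profile; the filter in [21] is two-dimensional, and `sigma` and `half_width` are hypothetical values chosen only for illustration.

```python
import math


def inverted_gaussian_profile(sigma=2.0, half_width=6):
    """1-D inverted Gaussian: lowest at the center, rising toward the edges.
    The mean is subtracted so that flat regions give zero response."""
    xs = range(-half_width, half_width + 1)
    raw = [-math.exp(-(x * x) / (2.0 * sigma * sigma)) for x in xs]
    mean = sum(raw) / len(raw)
    return [v - mean for v in raw]


def correlate(signal, kernel):
    """Valid-mode correlation of a 1-D signal with the kernel."""
    k = len(kernel) // 2
    return [
        sum(signal[i + j - k] * kernel[j] for j in range(len(kernel)))
        for i in range(k, len(signal) - k)
    ]


kernel = inverted_gaussian_profile()
# A bright profile with a dark, Gaussian-shaped dip centered at index 15.
profile = [1.0 - 0.8 * math.exp(-((i - 15) ** 2) / 8.0) for i in range(31)]
response = correlate(profile, kernel)
peak = response.index(max(response)) + len(kernel) // 2  # index in `profile`
```

The response peaks where the dark dip lines up with the kernel center and is zero over flat bright regions, which matches the high-correlation argument in the text.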
Eliminating blood vessels may also remove some of the HEs attached to them. Therefore, objects that correspond to blood vessels must also be considered. Morphological opening is applied to break the vasculature structure. This maneuver provides seed points for all types of HEs, including those attached to blood vessels. However, it also increases the number of seed points passed to the subsequent segmentation and classification stages, as depicted in Figure 4(c).

Image Calibration.
HEs can be present in the jelly-like vitreous humour, and the black background does not contribute to the detection phase. The black background is darker than HEs and misleads the detection process; it therefore impedes the automatic detection of HEs that reside at the retinal border. The black background is illuminated for effective and automatic detection. First, a median filter is applied to the green channel to suppress intensity variation in the background, and the result is binarized. The resulting image, called the retinal mask, highlights the retinal area. Next, an eroded mask is subtracted from the retinal mask to obtain the retinal boundary. The calibrated image is obtained by adding the enhanced green channel, the complemented retinal mask, and the retinal border. A sample calibrated image is depicted in Figure 5 and is used for segmenting HEs.
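The calibration steps can be sketched on a toy image as follows. This is a simplified illustration under stated assumptions: the median filter is omitted, the binarization threshold `thresh` and the 3 × 3 erosion are hypothetical choices, and images are nested lists with intensities in [0, 1].

```python
def binarize(img, thresh):
    """Retinal mask: 1 inside the retina, 0 in the dark background."""
    return [[1 if p > thresh else 0 for p in row] for row in img]


def erode(mask):
    """3x3 binary erosion; pixels outside the image count as background."""
    h, w = len(mask), len(mask[0])
    return [
        [
            int(all(
                0 <= y + dy < h and 0 <= x + dx < w and mask[y + dy][x + dx]
                for dy in (-1, 0, 1) for dx in (-1, 0, 1)
            ))
            for x in range(w)
        ]
        for y in range(h)
    ]


def calibrate(green, thresh=0.05):
    """Add the green channel, the complemented mask, and the retinal border,
    clipping the sum to 1 so that the background and rim become bright."""
    mask = binarize(green, thresh)
    eroded = erode(mask)
    out = []
    for y, row in enumerate(green):
        out_row = []
        for x, p in enumerate(row):
            border = mask[y][x] - eroded[y][x]   # retinal boundary
            background = 1 - mask[y][x]          # complemented mask
            out_row.append(min(1.0, p + background + border))
        out.append(out_row)
    return out


# Toy image: a 3x3 "retina" of intensity 0.4 on a black 5x5 background.
green = [[0.4 if 1 <= y <= 3 and 1 <= x <= 3 else 0.0 for x in range(5)]
         for y in range(5)]
calibrated = calibrate(green)
```

In the toy result, the background and the rim pixels become 1.0 while the interior keeps its original intensity, so a dark object near the rim would no longer merge with the background.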

Smart Window-Based Adaptive Thresholding Segmentation.
A segmentation method is sensitive to the dissimilarity between objects and their surroundings, which can be in terms of intensity or texture. There are two challenges in segmenting HEs. The first is the segmentation of HEs that blend with the black background and are located at the retinal rim; this background has been illuminated by image calibration. The second is the segmentation of HEs that are attached to blood vessels. Blood vessels and HEs share intensity characteristics, and both are dark smooth regions. Therefore, a novel smart window-based adaptive thresholding (SWAT) is proposed that isolates HEs from blood vessels. The method is adaptive and segments HEs surrounded by regions of varying brightness. A search space is defined for automatic detection to constrain segmentation within the image range. The complemented binary mask obtained in the previous section is expanded by 80 pixels to provide sufficient space for HEs residing at the retinal border.
Segmentation uses threshold values obtained by maximizing the inter-region variance of the image histogram [23]. This method determines the weighted variance $\sigma_B^2(j)$ between regions for a given threshold value $j$ as

$$\sigma_B^2(j) = \sum_{z=1}^{r} \omega_z \left(\mu_z - \mu_T\right)^2, \tag{1}$$

where $\mu_T$ is the mean value of the original image, $\omega_z$ is the total probability of an individual region $z$, and $\mu_z$ is the mean value of an individual region among the $r$ regions after thresholding, with $r \in R = \{2, 3, \ldots, 19, 20\}$. An optimum threshold value is taken successively by maximizing the inter-region variance as

$$\tau^* = \arg\max_{j} \, \sigma_B^2(j). \tag{2}$$

The effectiveness $\eta$ of an optimum threshold $\tau^*$ depends on selecting an appropriate number of regions from $R$; an appropriate number of regions provides maximum effectiveness. $\eta$ is the ratio of the weighted variance $\sigma_B^2(\tau^*)$ to the total variance $\sigma_T^2$ of the image:

$$\eta = \frac{\sigma_B^2(\tau^*)}{\sigma_T^2}. \tag{3}$$

SWAT starts from the seed points to segment retinal structures from the calibrated image (Figure 5). The search starts from the bounding box of a seed point. The calibrated image is cropped using the vertices $V = \{v_1, v_2, v_3, v_4\}$ of a seed point. The cropped window $W_1(x, y)$ is thresholded iteratively until the appropriate number of regions from $R$ is selected, as given by equation (4), where $\vartheta$ is a vector that contains $r - 1$ threshold values. Equation (4) provides robustness with respect to the regional diversity of HEs and foregrounds. With a bright foreground, few iterations are needed to meet the stopping criterion, which yields a small number of regions. With a dark foreground, more iterations are needed to reach $\eta \geq 0.8$, which requires more regions for effective segmentation. HEs are dark objects surrounded by various bright regions. The window is thresholded as

$$W_2(x, y) = \begin{cases} 1, & W_1(x, y) \leq \min(\vartheta), \\ 0, & \text{otherwise}, \end{cases} \tag{5}$$

where $\min(\vartheta)$ is the minimum threshold value of the vector $\vartheta$. A window may contain several HEs or dark objects after thresholding, so priority is given to the biggest ones because they are more dangerous for eyesight than smaller HEs. Therefore, the two largest objects are kept and the rest are removed based on their area. This step ensures that dark shades, which are often bigger than HEs, cannot mislead the segmentation and that actual HEs are retained within the window. Furthermore, an object closer to the center of the window is more likely to be an HE than the other one. This criterion is proposed because seed points are extracted using the matched filter, which models the intensity characteristics of HEs. Therefore, of the two objects, only the one with the minimum distance from the center of the window is kept. The distance $d_i$ of the $i$th object from the center $W_2(x_c, y_c)$ of the window is calculated using

$$d_i = \sqrt{\left(I_i(x) - x_c\right)^2 + \left(I_i(y) - y_c\right)^2}, \tag{6}$$

where $I_i(x)$ and $I_i(y)$ denote the $x$ and $y$ spatial locations of the $i$th object, respectively, and $i \in \{1, 2\}$.

An HE can be bigger than the window because the window is initiated from a seed point. The window must therefore be expanded to capture the complete HE using equation (7), in which $Q = [q_1, q_2, q_3, q_4]$ contains information about the border pixels. The binary variables $q_1$, $q_2$, $q_3$, and $q_4$ correspond to the left, top, right, and bottom border pixels, respectively. If all these variables are 0, no further iteration is required because the object is completely segmented. If any variable in $Q$ has a value of 1, the object extends beyond the window in that direction. The window is expanded using equation (7), and the calibrated image is cropped by the updated vector $V$.
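The stopping rule on the effectiveness $\eta$ can be illustrated with a small brute-force sketch. It is not the authors' implementation: the exhaustive search in `best_thresholds` is only practical for coarse histograms, and the function names, the `max_regions` cap, and the toy histogram are assumptions made for illustration. A pixel with gray level at most a threshold is assigned to the lower region.

```python
from itertools import combinations


def region_stats(hist, thresholds):
    """Return (weight, mean) for each non-empty region of the histogram."""
    total = sum(hist)
    edges = [0] + [t + 1 for t in thresholds] + [len(hist)]
    stats = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        count = sum(hist[lo:hi])
        if count:
            mean = sum(g * hist[g] for g in range(lo, hi)) / count
            stats.append((count / total, mean))
    return stats


def effectiveness(hist, thresholds):
    """eta: between-region variance divided by total variance."""
    total = sum(hist)
    mu_t = sum(g * n for g, n in enumerate(hist)) / total
    var_t = sum(n * (g - mu_t) ** 2 for g, n in enumerate(hist)) / total
    var_b = sum(w * (mu - mu_t) ** 2 for w, mu in region_stats(hist, thresholds))
    return var_b / var_t if var_t > 0 else 1.0


def best_thresholds(hist, n_regions):
    """Exhaustive search for the thresholds maximizing between-region variance
    (equivalently eta, since the total variance is constant)."""
    levels = range(len(hist) - 1)
    return max(combinations(levels, n_regions - 1),
               key=lambda ts: effectiveness(hist, ts))


def select_thresholds(hist, eta_min=0.8, max_regions=5):
    """Increase the number of regions until eta reaches eta_min."""
    ts = ()
    for r in range(2, max_regions + 1):
        ts = best_thresholds(hist, r)
        if effectiveness(hist, ts) >= eta_min:
            break
    return list(ts)


# Toy 8-level histogram with three separated intensity groups.
hist = [10, 0, 0, 10, 0, 0, 10, 0]
thresholds = select_thresholds(hist)
darkest_cut = min(thresholds)  # plays the role of min(theta) for dark objects
```

For this toy histogram, two regions give $\eta = 0.75$, below the 0.8 criterion, so a third region is added and two thresholds are returned; the smallest of them is the one that would isolate the darkest objects.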
The search space allows segmentation to be performed automatically. Some of the seed points are redundant and belong to blood vessels and dark shades. A window may grow beyond the image range when segmenting blood vessels or dark shades. The condition on vector S in (7) determines whether the vertices in vector V lie within the search space. Windows containing HE and non-HE objects are classified using the features described next.

Deep features are extracted using VGG16 [24], ResNet50 [25], and AlexNet [26]. Four SVMs are trained, one on the conventional features and one on the deep features of each network, to classify objects into HE and non-HE categories. Conventional features exploit the visual appearance of HEs. For instance, HEs have sharper edges than the macula, the region responsible for central vision, so Laplacian-gradient features differentiate HEs from the macula. Blood vessels are line-shaped objects, whereas HEs are comparatively circular; therefore, connected component descriptors are useful to distinguish them. Color features help to distinguish dark shades from HEs. Whether an object's contour is open or closed, the number of corner points, and the spatial distance from the corners to the object's center are hand-crafted features. Hence, connected component [27], texture [28], color [29], and hand-crafted features are extracted to train one SVM, while the VGG16, ResNet50, and AlexNet CNN models provide deep features for training the other SVMs.
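To make the line-shaped versus circular argument concrete, the sketch below computes the eccentricity of a binary object from its second-order central moments. Eccentricity is one common connected component descriptor and is used here only as an example; the text does not list the exact descriptors from [27], so the choice and the function name are assumptions.

```python
import math


def eccentricity(mask):
    """Eccentricity in [0, 1] from second-order central moments:
    close to 1 for line-like objects, close to 0 for round ones."""
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    n = len(pts)
    if n < 2:
        return 0.0
    mx = sum(x for x, _ in pts) / n
    my = sum(y for _, y in pts) / n
    sxx = sum((x - mx) ** 2 for x, _ in pts) / n
    syy = sum((y - my) ** 2 for _, y in pts) / n
    sxy = sum((x - mx) * (y - my) for x, y in pts) / n
    # Eigenvalues of the 2x2 covariance matrix [[sxx, sxy], [sxy, syy]].
    half_trace = (sxx + syy) / 2
    root = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    lam1, lam2 = half_trace + root, half_trace - root
    ratio = min(max(lam2 / lam1, 0.0), 1.0)  # guard against rounding error
    return math.sqrt(1.0 - ratio)


vessel_like = [[1] * 9]                    # a 1 x 9 line segment
blob_like = [[1] * 5 for _ in range(5)]    # a 5 x 5 compact blob
ecc_vessel = eccentricity(vessel_like)
ecc_blob = eccentricity(blob_like)
```

A vessel-like segment scores near 1 and a compact blob near 0, so a descriptor of this kind gives a classifier a direct handle on the shape difference described above.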

Results, Comparison, and Discussion
The findings of the proposed detection scheme are reported in this section. The performance metrics and a statistical comparison of the deep models are presented. The results are depicted in Figure 6.

Data Preparation and Evaluation Metrics.
The DIARETDB1 dataset is employed to detect HEs and compare the feature extraction models. The dataset is divided into training and testing subsets, and the training subset is further separated into training and validation subsets. Windows obtained by the SWAT segmentation were annotated using the ground truths. Twenty images are used to benchmark the performance of the classifiers. The classification results are compared using sensitivity (SE) and specificity (SP) [30]:

$$\mathrm{SE} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN}}, \tag{8}$$

$$\mathrm{SP} = \frac{\mathrm{TN}}{\mathrm{TN} + \mathrm{FP}}, \tag{9}$$

where true positives (TP) and true negatives (TN) are the correct predictions of the classifier. TP counts correctly classified hemorrhages, while TN counts correct predictions of the negative class. Conversely, false positives (FP) and false negatives (FN) are the false predictions of the classifier. An FP wrongly indicates that an object is a hemorrhage when it is not. An FN indicates that no hemorrhage is present while the window contains one.
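The two metrics can be computed from window-level labels as in the following sketch. The label lists are made-up values used only to exercise the functions; they are not results from this study.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = HE, 0 = non-HE)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity(tp, fn):
    """SE = TP / (TP + FN): fraction of true HEs that were detected."""
    return tp / (tp + fn) if tp + fn else 0.0


def specificity(tn, fp):
    """SP = TN / (TN + FP): fraction of non-HEs correctly rejected."""
    return tn / (tn + fp) if tn + fp else 0.0


# Illustrative labels only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
se, sp = sensitivity(tp, fn), specificity(tn, fp)
```

A missed hemorrhage increases FN and lowers SE, whereas a vessel fragment or dark shade accepted as a hemorrhage increases FP and lowers SP.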

Results.
The results show that the false-negative (FN) rate of the conventional features is higher than that of the deep features; that is, conventional methods fail to identify some of the HEs. Conversely, deep models are more capable of identifying HEs. The classification results of the deep and conventional models are provided in Table 1 and depicted visually in Figure 6.

Discussion
SE and SP measure the performance of the classification models. All the feature extraction models show promising results and are applicable to the detection of DR. However, the FN rate of the SVM trained on the conventional features is the highest, resulting in a lower SE than the other methods. The likely reason is the small arteries of the blood vessels. Blood vessels are classified using connected components, but small arteries also share these properties. Their similar appearance in terms of intensity and connected component characteristics therefore misleads the classifier, because they are labeled as the negative class. Furthermore, the architectures of the predefined networks show that the filter sizes of the first convolution layers of VGG16, ResNet50, and AlexNet are 3 × 3, 7 × 7, and 11 × 11, respectively. A small filter size is appropriate for HE classification because HEs are homogeneous: they are regarded as dark smooth regions. VGG16 has the smallest filter size and identifies more HEs, as its FN rate is the lowest among the deep models. Conversely, dark shades and small blood vessels mislead VGG16, resulting in the highest FP rate.
The analysis of the classification results recommends deep models for HE classification. The reason could be that some conventional features are not effective and mislead the classifier. Deep networks provide relevant features because they learn them incrementally from the data, so irrelevant features are less likely to mislead the classification stage. On the other hand, CNN models take more time to learn from the windows and recognize them. They often need large numbers of training examples, depending on the complexity of the data, for better performance. The conventional method needs comparatively less time and fewer training examples to obtain the statistical features.

Conclusion
This research presents an automatic detection technique to compare various deep-learning-based models with the conventional feature extraction approach. The method first enhances the quality of the fundus images in the preprocessing stage so that pathological signs appear more clearly. Then, the locations of the hemorrhages are estimated using seed point extraction, which expedites the detection process. Deep and conventional features classify the objects into hemorrhages and nonhemorrhages. The research concept emerged from a problem highlighted by the research community: two types of hemorrhages are challenging to detect, namely those that are associated with blood vessels and those that are located at the retinal border. Our detection scheme is suitable for all types, including hemorrhages that reside in the vitreous humour. This study also indicates that deep features classify hemorrhages better than conventional methods; hence, they are more efficient and suitable for hemorrhage classification. The assessment of the performance metrics of the deep models reveals that a shallow network produces results competitive with deeper models. A very deep network may not yield significant improvements but increases training time. In this study, AlexNet shows promising results despite being the shallowest network. Therefore, selecting a suitable network with appropriate parameters is critical.
The aim of this research is to present a fully automated scheme that reduces the rate at which ophthalmologists interpreting fundus photographs miss hemorrhages. The method identifies hemorrhages in an interactive way that is easy to interpret for diabetic retinopathy diagnosis. Furthermore, the locations of hemorrhages are highlighted, which might help clinicians determine the severity level of the disease.

Data Availability
Publicly available datasets were analyzed in this study. The data can be found at https://www.it.lut.fi/project/imageret/diaretdb1/index.html.

Conflicts of Interest
All authors in this study declare no conflicts of interest.

Authors' Contributions
TA performed the investigation, methodology, software implementation, writing the original draft, conceptualization, and formal analysis. CC was involved in funding acquisition, project administration, supervision, review, and editing. All authors have read and agreed to the published version of the manuscript.