A Decision Support Framework for Automated Screening of Diabetic Retinopathy

The early signs of diabetic retinopathy (DR) include microaneurysms, among other signs. A prompt diagnosis while the disease is at an early stage can help prevent irreversible damage to the diabetic eye. In this paper, we propose a decision support system (DSS) for automated screening of early signs of diabetic retinopathy. Classification schemes for deducing the presence or absence of DR are developed and tested. The detection rule is based on a binary-hypothesis testing problem, which reduces the problem to yes/no decisions. An analysis of the performance of the Bayes optimality criteria applied to DR is also presented. The proposed DSS is evaluated on real-world data. The results suggest that by biasing the classifier towards DR detection, it is possible to make the classifier achieve good sensitivity.


INTRODUCTION
According to the American Diabetes Association, 18.2 million Americans, or 6.3% of the total population, have diabetes. In the United States alone, diabetes is responsible for 8% of legal blindness, making it the leading cause of blindness in people between 20 and 74 years of age [1][2][3]. Researchers, including the authors of this paper [4], have therefore suggested an automated screening system for diabetic retinopathy for prompt diagnosis. Since the disorders exhibited in the early stage do not affect vision, the disease can be detected at its onset only if diabetic patients undergo regular eye examinations.
This paper proposes an automated screening system that would detect early signs of nonproliferative diabetic retinopathy (NPDR). The contributions of this paper are twofold: (a) automated detection methods based on image processing for identifying lesions related to DR and (b) a decision support system (DSS) for automated DR screening. Classification schemes for deducing the presence or absence of DR are developed and tested. A univariate approach has been devised to test the suitability of the classification mechanism with respect to the detection of retinopathy. Classification is performed on test data subjected to unsupervised learning. This approach has been developed for one particular feature, but the feature space can be extended depending on the number of disorders that need to be detected. The detection rule is based on a binary-hypothesis testing problem, which reduces the problem to yes/no decisions. An analysis of the performance of the Bayes optimality criteria applied to DR is also presented.
The test data for the classification scheme is composed of real-world retinal images obtained from Lions Eye Research Center at LSU, New Orleans. The data contains retinal images that belong to either background retinopathy, maculopathy, or preproliferative retinopathy. The DSS framework focuses mainly on microaneurysms, as these are the early signs of DR and are present at all stages as the disease progresses from mild to severe NPDR. The DR screening results obtained from the DSS are compared with the physician's diagnosis to measure the system's sensitivity. The results suggest that by biasing the classifier towards DR detection, it is possible to achieve 100% sensitivity, although at a reduced specificity of 67%. Since sensitivity measures how reliably abnormal cases are detected, this bias towards sensitivity is reasonable.
The organization of the paper is as follows. Section 2 describes the background and related work. Section 3 presents the decision framework adopted for the detection of microaneurysms, followed by the experimental results and conclusions in Sections 4 and 5, respectively.
International Journal of Biomedical Imaging

BACKGROUND AND RELATED WORK
Diabetic retinopathy is a progressive disease and the condition may advance from mild retinopathy to severe proliferative retinopathy. The Diabetes Control and Complications Trial (DCCT) [5] and the UK Prospective Diabetes Study (UKPDS) [6] concluded that blood glucose and blood pressure control can slow down retinopathy.

DR related disorders and stages
Diabetic retinopathy can be broadly classified as nonproliferative diabetic retinopathy (NPDR) and proliferative diabetic retinopathy (PDR). The European Association for Study of Diabetic Complications (EASDEC) [7] has characterized the stages of diabetic retinopathy as follows.
(i) Background retinopathy. This condition is often present without any visual impairment and can therefore go unnoticed if a diabetic patient does not undergo a dilated eye exam at regular intervals. Findings in the retina show microaneurysms and exudates that are enlarged tiny blood vessels (dots) and tiny haemorrhages or leaky areas (blots), respectively, on the surface of the retina.
(ii) Maculopathy. Areas of leakage develop in the retina and the retina becomes boggy. The leak can continue to enlarge. The waterlogging can affect the central part of the retina, the macula, and eventually affect the vision.
(iii) Preproliferative retinopathy. A large number of haemorrhages and microaneurysms are exhibited, and intraretinal microvascular abnormalities (IRMA) along with venous beading are seen. The condition is called "preproliferative" as it usually progresses to proliferative retinopathy, when "new vessels" develop.
(iv) Proliferative retinopathy. Proliferative diabetic retinopathy (PDR) carries the greatest risk of visual loss. The condition is characterized by the development of new, abnormal vessels (neovascularisation) near the optic nerve and haemorrhages in the vitreous humor and in front of the retina. The neovascular vessels are weak and can bleed into the vitreous humor of the eye, leading to permanent complications.
The stages that precede proliferative retinopathy constitute nonproliferative diabetic retinopathy (NPDR). A prompt diagnosis at the early stage of diabetic retinopathy can help prevent severe damage to the retina of a diabetic patient.

Automated DR screening system
An automated screening system for diabetic retinopathy consists of the three stages described below. The first stage involves image-based feature detection and analysis, that is, identifying the patterns of interest using image processing methods. Image segmentation, edge/boundary detection, and shape and texture analysis are some of the techniques commonly used in image processing for pattern detection purposes. Feature analysis can be carried out on the original image or in the transform domain.
The next stage involves representing the features in the feature space and analyzing the features jointly in order to characterize the image, as a whole, in terms of retinal disorders. The abnormal features detected using the feature analysis provide very useful information such as the location, size, center, and other geometrical aspects of the features. This information needs to be analyzed in the feature space in order to reduce the ambiguities that are common in any image analysis.
The final stage involves developing a classification scheme that classifies the given retinal image based on the abnormalities present in the retinal images and the severity that they exhibit. Confidence levels need to be estimated using statistical methods for all the estimates. Discriminant functions and algorithms for distinguishing images based on the features present in the image need to be developed.
While significant research [8,9] is being carried out on extracting vessels and abnormalities from retinal images, a comprehensive framework for automated screening that combines a statistical framework with feature analysis has not yet been developed in the research community.

Feature detection using image processing techniques
Researchers [10][11][12][13][14] have approached the problem of feature detection in varied ways. A modular system developed in [15] makes use of a large database of images in which features have been identified by physicians. This database is later used to detect similar features in new images. The recognition process employs an unsupervised learning mechanism and the classification phase uses supervised learning.
Another technique proposed in [16] automatically detects and distinguishes between different lesions (hard exudates, cotton-wool spots, and haemorrhages) after image enhancement. The image is enhanced by taking the difference between the background illumination and an edge detection operator.
An automated system such as the one proposed in [17,18] can be used to substitute a trained observer to identify and quantify microaneurysms. In this system, a combination of shade correction, matched filters, and shape algorithms is used to detect microaneurysms in fluorescein angiograms.
A comparative microaneurysm digital detection system is described in [10,19,20]. This system registers retinal images of the same eye subjected to a series of studies by focusing on the regions centered around the fovea and provides a comparative result for the count of microaneurysms.
Detection of microaneurysms in digital angiograms of the eye fundus is proposed in [21]. This system determines the count of microaneurysms by first enhancing the retinal image and then employing object segmentation.

PROPOSED DECISION SUPPORT SYSTEM
In this paper, three different classification schemes based on the Bayesian framework, namely, the likelihood ratio test, the maximum a posteriori detector, and the Bayes detector, are presented. The decision framework considers four possible outcomes, two correct and two erroneous, in order to analyze the performance of the classification schemes. The erroneous classifications are given by false accept (FA) and false reject (FR), which are referred to as type-I and type-II errors, respectively [22]. In the diabetic retinopathy application, the cost associated with each of these is governed by the amount of harm caused by a misdiagnosis. The repercussions of categorizing a person affected with diabetic retinopathy as normal are obviously greater than those of the converse. Some statisticians refer to the cost function as the loss function. The cost or the loss associated with any misclassification is directly proportional to the severity induced by the error. Considering a univariate case and analyzing each feature independently, the probability density functions obtained for the case wherein microaneurysms are present and the case where they are absent are shown in Figure 1. The data is composed of 25 normal images and 23 affected images of the retina selected from a large pool of images.
It can be seen that the region of overlap ranges from a feature value of 0.1 to 0.375. The boundary value occurs at feature value of 0.3. At the boundary value, decision can be taken for either case. But the loss function can be such that the loss associated with normal classification is more than that with abnormal classification. Thus, in order to minimize the risk involved in misclassification, all the cases at the boundary value can be classified as abnormal.
The region under the abnormal probability density function ranging from 0.1 to 0.3 corresponds to the false reject rate, where an abnormal person is classified as unaffected. The region under the normal probability density function ranging from 0.375 to 0.4 corresponds to the false accept rate, where a normal person is categorized as affected. These two regions are the erroneous regions.
The area under the false accept region gives the false alarm probability and was found to be equal to $9.8374 \times 10^{-7}$.
The rest of the regions are classified as correct detection, wherein there is a clear demarcation between the two hypotheses. The detection probability is given by the area under the abnormal density function excluding the overlap and was found to be 0.7071.
The discriminability is directly proportional to the difference in the means and varies inversely with the variance. Therefore, the farther apart the two density functions are and the smaller the variance they exhibit, the greater the discriminability and the smaller the overlap. If the discriminability is infinite, the two density functions do not overlap, the type-I and type-II errors are zero, and the detection is perfect. The discriminability is a quantitative measure of decidability and is independent of the chosen decision criteria. The discriminability for the case shown in Figure 1 was found to be d = 2.7528.
When a feature value is presented to the system, it can be classified either as normal or abnormal. This detection process can be treated as a binary-hypothesis testing problem. The hypotheses supported are the null hypothesis, $H_0$ (specifies the absence of microaneurysms), and the alternative hypothesis, $H_1$ (specifies the presence of microaneurysms). Each of the hypotheses has a probability density function associated with it. In the case of diabetic retinopathy, each of the two categories, that is, the affected and the unaffected, follows a Gaussian distribution in which the feature values tend to a particular value (given by the mean) with a little variation (given by the variance). For the aforementioned data, maximum likelihood estimation was carried out in order to determine the conditional probability density functions, and the problem was reduced to the detection of the following hypotheses:
$$H_0 : x \sim p_0 \quad \text{versus} \quad H_1 : x \sim p_1,$$
where $p_0$ and $p_1$ are the density functions for the normal and the affected case, respectively.
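As an illustration of the maximum likelihood step mentioned above, the following Python sketch estimates the two class means and a single shared standard deviation from labelled feature values. The function names and the toy feature values are assumptions made for this example; pooling the squared deviations of both classes is one way to obtain a common standard deviation and is not necessarily the procedure applied to the data in this paper.

```python
import math


def fit_shared_variance_gaussians(normal, abnormal):
    """ML estimates of the class means and one pooled standard deviation."""
    mu0 = sum(normal) / len(normal)
    mu1 = sum(abnormal) / len(abnormal)
    # Pool squared deviations from each class mean; the ML estimate
    # divides by the total sample count (not by N - 2).
    ss = sum((x - mu0) ** 2 for x in normal)
    ss += sum((x - mu1) ** 2 for x in abnormal)
    sigma = math.sqrt(ss / (len(normal) + len(abnormal)))
    return mu0, mu1, sigma


def gaussian_pdf(x, mu, sigma):
    """Class-conditional Gaussian density evaluated at feature value x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Toy feature values (hypothetical, not the retinal data of this paper).
toy_normal = [0.0, 0.1]
toy_abnormal = [0.5, 0.6]
mu0, mu1, sigma = fit_shared_variance_gaussians(toy_normal, toy_abnormal)
```

The fitted triple can then be passed to `gaussian_pdf` to evaluate either density at a new feature value.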
The variance obtained for the normal and the abnormal case for the data under consideration was found to be the same. The means obtained for these two sample datasets are $\mu_0 = 0.0475$ and $\mu_1 = 0.5842$, with a common standard deviation of $\sigma = 0.0832$. Since the variance for the two categories is the same and the difference between the means is small, the discriminability is small. The decision rule, $\delta$, for $H_0$ versus $H_1$ is a function of the feature and takes a value of either 1 or 0 depending on whether the feature belongs to $H_1$ or $H_0$. Also, if the feature does not belong to $H_1$, then it belongs to $H_0$, that is, $H_0 = H_1^c$. Therefore, the decision rule can be mathematically represented as
$$\delta(x) = \begin{cases} 1 & \text{if } x \in H_1, \\ 0 & \text{if } x \in H_1^c, \end{cases}$$
where $x$ is the feature. Detection can be performed by employing any optimal decision criterion. This paper considers the following detection methods.
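When the decision rule takes the form of a single threshold on the feature, the two error probabilities are Gaussian tail areas. The Python sketch below evaluates them with the complementary error function. The threshold form of the rule, the function name, and the example threshold (the midpoint of the two means) are assumptions made for illustration; the values it produces are not the figures reported in this paper, which are read from the overlap region in Figure 1.

```python
import math


def error_probabilities(t, mu0, mu1, sigma):
    """Return (false_accept, false_reject) for the rule 'abnormal if x > t'."""
    root2 = math.sqrt(2.0)
    # False accept (type-I): a normal feature value falls above t.
    false_accept = 0.5 * math.erfc((t - mu0) / (sigma * root2))
    # False reject (type-II): an abnormal feature value falls at or below t.
    false_reject = 0.5 * math.erfc((mu1 - t) / (sigma * root2))
    return false_accept, false_reject


# Means and standard deviation quoted in the text.
MU0, MU1, SIGMA = 0.0475, 0.5842, 0.0832

# Hypothetical threshold: halfway between the two means.
fa, fr = error_probabilities(0.5 * (MU0 + MU1), MU0, MU1, SIGMA)
```

Raising the threshold lowers the false accept probability and raises the false reject probability, which is the trade-off that the loss function is meant to settle.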

Likelihood ratio test
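The conditional pdf, $p_i(x)$, gives the likelihood that a feature value $x$ belongs to a particular state of nature $H_i$. The likelihood ratio is given by
$$\Lambda(x) = \frac{p_1(x)}{p_0(x)}.$$
A detection method that compares the likelihood ratio with a certain threshold value is called the likelihood ratio test. For the Gaussian distributions considered for the two possible classes, the class-conditional densities are
$$p_i(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{-\frac{(x - \mu_i)^2}{2\sigma^2}\right\}, \quad i = 0, 1,$$
so that the likelihood ratio is given as
$$\Lambda(x) = \exp\left\{\frac{(x - \mu_0)^2 - (x - \mu_1)^2}{2\sigma^2}\right\}. \tag{6}$$
Figure 2 shows the likelihood ratio. Since the two distributions are independent of each other, the numerator and the denominator can be interchanged.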
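The likelihood ratio for two Gaussian classes with a common variance can be evaluated directly from the exponent, without computing either density. The Python sketch below does this using the means and standard deviation quoted earlier in this section; the function name is an assumption made for illustration, and the ratio is taken as the abnormal density over the normal density.

```python
import math

# Means and standard deviation quoted in the text.
MU0, MU1, SIGMA = 0.0475, 0.5842, 0.0832


def likelihood_ratio(x, mu0=MU0, mu1=MU1, sigma=SIGMA):
    """p1(x) / p0(x) for two Gaussians sharing the same sigma."""
    # The normalising constants cancel, leaving an exponent linear in x.
    exponent = (mu1 - mu0) * (2.0 * x - mu0 - mu1) / (2.0 * sigma ** 2)
    return math.exp(exponent)


# The ratio equals 1 halfway between the means and grows with x.
midpoint = 0.5 * (MU0 + MU1)
ratio_at_midpoint = likelihood_ratio(midpoint)
```

Because the exponent is linear in the feature value, the ratio is monotonic, which is why a threshold on the ratio can be converted into a threshold on the feature itself.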
A threshold $\tau$ is fixed such that a decision can be made by comparing the likelihood ratio, denoted $\Lambda(x)$, with $\tau$. Thus, the decision rule becomes
$$\delta(x) = \begin{cases} 1 & \text{if } \Lambda(x) > \tau, \\ 0 & \text{otherwise.} \end{cases}$$
The threshold depends on the prior knowledge about each hypothesis and the cost or loss associated with each classification. The prior probability does not depend on the feature value. The prior probabilities for diabetic retinopathy can be based demographically on the percentage of the population affected by the disease. The sample space consisting of retinal images for our analysis belongs to Louisiana State University Eye Center. The statistics that correspond to Louisiana State University Eye Center imply that 60% of the diabetic population is prone to diabetic retinopathy. Thus, the prior probability for the abnormal case is P(abnormal) = 0.6 and for the normal case is P(normal) = 1 − P(abnormal) = 0.4.
The loss associated with each classification is determined by a loss matrix $L$, where an element $L_{ij}$ represents the loss associated with choosing a hypothesis $H_i$ when $H_j$ is true. For a binary-hypothesis testing problem, $i$ and $j$ can only take values of 0 or 1. Also, the loss ranges from 0 to 1: zero corresponds to no loss and 1 corresponds to maximum loss. Zero loss is attributed to the case where correct detection occurs, that is, type-I (FA) and type-II (FR) errors are absent. As mentioned earlier, the loss associated with a type-II error should be more than the loss associated with a type-I error. In other words, the loss incurred in classifying an abnormal person as normal, given by $L_{01}$, should be greater than the converse, that is, $L_{10}$. In our calculations we have made use of the following loss matrix:
$$L = \begin{pmatrix} L_{00} & L_{01} \\ L_{10} & L_{11} \end{pmatrix} = \begin{pmatrix} 0 & 0.8 \\ 0.3 & 0 \end{pmatrix}.$$
The threshold $\tau$ is determined by the prior probabilities and the entries of the loss matrix. By substituting P(normal) = 0.4, P(abnormal) = 0.6, $L_{11} = L_{00} = 0$, $L_{01} = 0.8$, and $L_{10} = 0.3$, we get a threshold value of 1.7778. Thus, if the likelihood ratio is more than 1.7778, then the feature should be classified as abnormal, and normal otherwise.
Instead of comparing the likelihood ratio with the threshold $\tau$, the feature value can be compared with a threshold $\tau'$ which can be derived from (6). Threshold $\tau'$ was found to be
$$\tau' = \frac{\mu_0 + \mu_1}{2} + \frac{\sigma^2 \ln \tau}{\mu_1 - \mu_0}.$$
Thus, the detector becomes
$$\delta(x) = \begin{cases} 1 & \text{if } x > \tau', \\ 0 & \text{otherwise.} \end{cases}$$
By substituting the appropriate values in the above equation, we get $\tau' = 0.3233$. Thus, if a feature value is above 0.3233, then it can be classified as abnormal, and normal otherwise.
P. Kahai et al.
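The conversion from a likelihood ratio threshold to a feature threshold can be checked numerically. For two Gaussians with a common variance, setting the likelihood ratio equal to the threshold and solving for the feature value gives the midpoint of the means plus a correction term proportional to the logarithm of the threshold. The Python sketch below applies this to the values quoted in the text; the function name is an assumption made for illustration.

```python
import math


def feature_threshold(tau, mu0, mu1, sigma):
    """Feature value at which the equal-variance Gaussian likelihood ratio equals tau."""
    # Midpoint of the means, shifted by sigma^2 * ln(tau) / (mu1 - mu0).
    return 0.5 * (mu0 + mu1) + sigma ** 2 * math.log(tau) / (mu1 - mu0)


# Threshold, means and standard deviation as quoted in the text.
tau_prime = feature_threshold(1.7778, 0.0475, 0.5842, 0.0832)
print(round(tau_prime, 4))  # prints 0.3233
```

A likelihood ratio threshold above 1 moves the feature threshold to the right of the midpoint, and a threshold below 1 moves it to the left.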

Maximum a posteriori detector
The posterior probability, unlike the prior probability, is the conditional probability that a particular state of nature exists for a given feature value and is represented as $P(H_k \mid x)$, where $k$ represents the class index. For better comprehension, we represent the posterior probability for the normal category as $P(\text{normal} \mid x)$ and for the abnormal case as $P(\text{abnormal} \mid x)$. In general, the posterior probability is obtained by the Bayes formula as follows:
$$P(H_k \mid x) = \frac{p(x \mid H_k)\, P(H_k)}{\sum_{j=0}^{c-1} p(x \mid H_j)\, P(H_j)},$$
where $p(x \mid H_k)$ is the class-conditional probability or the likelihood, $P(H_k)$ is the prior probability, and $c$ is the total number of possible hypotheses. The posterior probability for the normal case is thus given as follows:
$$P(\text{normal} \mid x) = \frac{p_0(x)\, P(\text{normal})}{p(x)},$$
where $p_0(x) = p(x \mid H_0)$. Similarly, the posterior probability for the abnormal case is
$$P(\text{abnormal} \mid x) = \frac{p_1(x)\, P(\text{abnormal})}{p(x)},$$
where $p_1(x) = p(x \mid H_1)$. Here, $p(x)$ is a scaling factor and is given as
$$p(x) = p_0(x)\, P(\text{normal}) + p_1(x)\, P(\text{abnormal}).$$
Now, the decision rule is based on choosing the hypothesis whose posterior probability is maximum and hence, it is called the maximum a posteriori (MAP) detector. So, the decision rule is represented as
$$\delta_{\text{MAP}}(x) = \begin{cases} 1 & \text{if } P(\text{abnormal} \mid x) \geq P(\text{normal} \mid x), \\ 0 & \text{otherwise.} \end{cases}$$
Classification using the MAP detector is depicted in Table 1. It can be deduced from the table that a retinal image that presents four or more microaneurysms is treated as abnormal by the MAP detector. The final diagnosis rests with the physician and is shown in Section 4.
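The posterior computation and the MAP decision can be sketched in a few lines of Python. The example below uses the priors, means, and standard deviation quoted in the text; the function names are assumptions made for illustration, and ties are resolved in favour of the abnormal class, which is a choice made here in line with the paper's preference for sensitivity.

```python
import math

# Values quoted in the text.
MU0, MU1, SIGMA = 0.0475, 0.5842, 0.0832
P_NORMAL, P_ABNORMAL = 0.4, 0.6


def gaussian_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def posteriors(x):
    """Return (P(normal | x), P(abnormal | x)) by the Bayes formula."""
    joint_normal = gaussian_pdf(x, MU0, SIGMA) * P_NORMAL
    joint_abnormal = gaussian_pdf(x, MU1, SIGMA) * P_ABNORMAL
    # p(x) is the scaling factor that makes the posteriors sum to 1.
    # For feature values far outside [0, 1] both terms can underflow to 0.
    evidence = joint_normal + joint_abnormal
    return joint_normal / evidence, joint_abnormal / evidence


def map_decision(x):
    """1 (abnormal) if the abnormal posterior is at least the normal one."""
    post_normal, post_abnormal = posteriors(x)
    return 1 if post_abnormal >= post_normal else 0
```

For instance, `map_decision(0.05)` returns 0 and `map_decision(0.6)` returns 1, since these feature values lie close to the normal and abnormal means, respectively.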

Bayes detector
The Bayes detector is based on the optimality criterion that minimizes the Bayes risk or average risk $r(\delta)$. The Bayes risk is the overall cost incurred by a decision rule $\delta$ and it depends on the conditional risk or expected loss. The conditional risk $R_k(\delta)$ is the average cost incurred by a decision rule when hypothesis $H_k$ is true. Suppose that a particular observed feature is classified as $H_i$, whereas the true hypothesis happens to be $H_j$; then the loss incurred, as shown in the previous section, would be $L_{ij}$. Because $P(H_j \mid x)$ is the probability that the true state of nature is $H_j$, the expected loss in classifying the feature as $H_i$ would be [23]
$$R(H_i \mid x) = \sum_{j=0}^{c-1} L_{ij}\, P(H_j \mid x),$$
where $c$ is the total number of possible hypotheses. To minimize the overall risk, compute the conditional risk given above for each possible hypothesis and select the hypothesis for which the risk is minimum. This minimum conditional risk is called the Bayes risk. The expected loss incurred in classifying a person as normal, that is, in choosing $H_0$, is
$$R(H_0 \mid x) = L_{00}\, P(H_0 \mid x) + L_{01}\, P(H_1 \mid x).$$
Similarly, the expected loss incurred in classifying a person as abnormal is
$$R(H_1 \mid x) = L_{10}\, P(H_0 \mid x) + L_{11}\, P(H_1 \mid x).$$
The Bayes risk is given as
$$\min\{R(H_0 \mid x),\; R(H_1 \mid x)\}. \tag{20}$$
The Bayes decision rule is
$$\delta_B(x) = \begin{cases} 1 & \text{if } R(H_1 \mid x) < R(H_0 \mid x), \\ 0 & \text{if } R(H_1 \mid x) > R(H_0 \mid x). \end{cases}$$
The Bayes detector does not provide any condition at the boundary. Therefore, the boundary feature values, at which the probability of classifying into each of the categories is the same, can be classified into either category. Specifically, at the boundary we should classify each feature as abnormal so that a person for whom the true hypothesis is abnormal is not misdiagnosed.
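The minimum-risk selection described above can be sketched as follows in Python. The loss values are those quoted in the text, indexed so that `LOSS[i][j]` is the loss of choosing hypothesis i when hypothesis j is true; the function names and the example posteriors are assumptions made for illustration, and ties are resolved in favour of the abnormal class as the text recommends for boundary cases.

```python
# LOSS[i][j]: loss of choosing H_i when H_j is true (values from the text).
LOSS = [
    [0.0, 0.8],  # choose normal:   correct / missed abnormal case
    [0.3, 0.0],  # choose abnormal: false alarm / correct
]


def expected_losses(post_normal, post_abnormal, loss=LOSS):
    """Expected loss of each decision given the two posterior probabilities."""
    posterior = (post_normal, post_abnormal)
    return [sum(loss[i][j] * posterior[j] for j in range(2)) for i in range(2)]


def bayes_decision(post_normal, post_abnormal, loss=LOSS):
    """Return (decision, risk): the minimum-risk class and its expected loss."""
    risk_normal, risk_abnormal = expected_losses(post_normal, post_abnormal, loss)
    # A tie goes to 'abnormal' (1) so that an affected person is not missed.
    decision = 1 if risk_abnormal <= risk_normal else 0
    return decision, min(risk_normal, risk_abnormal)


# Hypothetical posteriors for two feature values.
uncertain_case = bayes_decision(0.5, 0.5)
likely_normal_case = bayes_decision(0.9, 0.1)
```

With equal posteriors the asymmetric losses tip the decision towards abnormal, whereas a posterior of 0.9 for the normal class is enough to outweigh the larger loss attached to a missed abnormal case.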

EXPERIMENTAL RESULTS
We have made use of 143 retinal images provided by the Louisiana State University Eye Center. Supervised learning was performed for training, whereas unsupervised learning was used to test the system. The system is trained for NPDR.
In our experiments we have compared the retinal images of the diabetic patients which do not manifest microaneurysms with those which do. Moderate-to-severe cases were considered for the case wherein microaneurysms are present. A YES decision (abnormal) corresponds to the presence of microaneurysms for the moderate and severe cases of NPDR and a NO decision (normal) relates to the absence of microaneurysms. Each decision has an associated cost that is represented by the Bayes risk. The results obtained are given in Table 2.
The feature value corresponds to the number of microaneurysms exhibited by the test image. The system classifies an affected retinal image as abnormal, whereas an unaffected retinal image is classified as normal. The Bayes risk as shown in (20) is given by the minimum of the two expected losses. From Table 1, it can be deduced that for an image that exhibits fewer than 3 microaneurysms, the expected loss associated with normal classification is less than that with abnormal classification. Hence a feature value of less than 3 has been categorized as normal, which is rational. The feature threshold provided by the likelihood ratio test, 0.3233, corresponds to a feature value of 4. If the posterior probability of classifying a feature as abnormal is the same as that of normal, then the classification can be made either way. But classifying it as abnormal would reduce the risk involved in misclassification. Thus, if the retinal image presented exhibits 4 or more microaneurysms, it is classified as abnormal. The system classifies feature values of 4 and 5 as abnormal, whereas the physician's diagnosis is otherwise. This is because of the asymmetric costs associated with each classification. The final decision rests with the physician; therefore, if the system treats a normal image as abnormal, the cost incurred is less than that of the converse. The sensitivity of the decision related to the classification of microaneurysms is 100%, while its specificity is 67%. The computational time is mainly dependent on the detection algorithm and it is approximately 10 nanoseconds.
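Sensitivity and specificity are obtained by comparing the system's decisions with the physician's diagnosis image by image. The Python sketch below shows the computation; the function name and the six example labels are hypothetical and are not the data of Table 2.

```python
def sensitivity_specificity(predicted, actual):
    """Compare system decisions with reference diagnoses (1 = abnormal, 0 = normal)."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p == 1 and a == 1)
    fn = sum(1 for p, a in pairs if p == 0 and a == 1)
    tn = sum(1 for p, a in pairs if p == 0 and a == 0)
    fp = sum(1 for p, a in pairs if p == 1 and a == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one abnormal and one normal case")
    sensitivity = tp / (tp + fn)  # fraction of abnormal cases detected
    specificity = tn / (tn + fp)  # fraction of normal cases left unflagged
    return sensitivity, specificity


# Hypothetical labels: one normal image is flagged as abnormal.
system = [1, 1, 1, 0, 0, 1]
physician = [1, 1, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(system, physician)
```

In this toy example every abnormal image is detected, so the sensitivity is 1.0, while one of the three normal images is flagged, giving a specificity of 2/3.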

CONCLUSIONS AND FUTURE WORK
This paper proposed a decision support framework for automated screening of DR for the univariate case. This model can be extended to multiple disorders, in which case it would include the covariance associated with all the signs of DR. The experiments support the feasibility of a complete automated screening mechanism that includes all the disorders related to DR. The machine can be made adaptable by including a Bayesian learning mechanism that modifies the priors as each new feature value is presented, thereby improving the accuracy of the classifier.