Choroid Segmentation of Retinal OCT Images Based on CNN Classifier and l2-lq Fitter

Optical coherence tomography (OCT) is a noninvasive cross-sectional imaging technology used to examine the retinal structure and pathology of the eye. Evaluating the thickness of the choroid using OCT images is of great interests for clinicians and researchers to monitor the choroidal thickness in many ocular diseases for diagnosis and management. However, manual segmentation and thickness profiling of choroid are time-consuming which lead to low efficiency in analyzing a large quantity of OCT images for swift treatment of patients. In this paper, an automatic segmentation approach based on convolutional neural network (CNN) classifier and l2-lq (0 < q < 1) fitter is presented to identify boundaries of the choroid and to generate thickness profile of the choroid from retinal OCT images. The method of detecting inner choroidal surface is motivated by its biological characteristics after light reflection, while the outer chorioscleral interface segmentation is transferred into a classification and fitting problem. The proposed method is tested in a data set of clinically obtained retinal OCT images with ground-truth marked by clinicians. Our numerical results demonstrate the effectiveness of the proposed approach to achieve stable and clinically accurate autosegmentation of the choroid.


Introduction
Choroid is the vascular layer located between retina and sclera. Its inner surface is connected with the retinal pigment epithelium (RPE) through Bruch's membrane (BM), and the outer surface is connected with the sclera. Recent researches indicated that the changes of the choroidal thickness could be related to some ocular conditions such as macular degeneration and myopia [1][2][3][4]. Therefore, segmentation and the ability to accurately measure the thickness of the choroid are of clinical importance.
Optical coherence tomography (OCT) is a technique for obtaining subsurface images of translucent or opaque materials with high resolution [5,6]. It uses low-coherence inter-ferometry and imaging reflections from interior tissues to generate cross-sectional images. Comparing with traditional imaging methods, OCT has some obvious advantages of being nondestructive, high resolution, and minimally invasive, and it has been widely applied in ocular detections for many years [3,7,8]. However, the imaging quality of the choroid is not good enough in retinal OCT images due to the shortage of penetration depth [9]. The major challenges of the choroidal segmentation are from low contrast of the lower boundary and unknown noise in the images, which will make the detection result inaccurate and unreliable.
To segment the choroid efficiently, many researchers studied model-based methods with prior assumptions for the structure of the input images: A-scan [10,11], active contour [12][13][14][15], sparse high order potentials [16], and 2D/3D graph [17][18][19][20][21][22] methods. The main limitation of these traditional approaches is their high dependence on the feature extraction phase for the accurate segmentation. However, extraction of appropriate image features is difficult for a definite medical image recognition problem, and the traditional methods may provide disappointing segmentation results. In recent years, machine learning-based methods have achieved excellent performance in computer vision and medical image analysis. Convolutional neural network (CNN) is one of the extensive application approaches for image processing and also effective for multilayer segmentation in OCT images. Sui et al. used a graph-searching-based segmentation technique with learning an optimal graph weight by using CNN architecture [23]. Masood et al. proposed a two-stage segmentation method to segment out the BM and choroid and calculate the thickness map [24]. A series of morphological operations were used to segment BM, while the choroid was segmented using CNN. Fang et al. presented a novel framework combining CNN and graph search methods (CNN-GS) to segment nine layer boundaries on retinal OCT images [25]. Alonso-Caneiro et al. used a fully-convolutional network (FCN) technique based on graph search theory to segment the choroidal boundary and obtain the choroidal thickness profile from OCT images [26]. There are also some other outstanding networks for OCT image segmentation such as U-shape convolutional network (U-Net), which is considered as the most widely applicable architecture for medical image segmentation [27,28].
It is worth noting that a major disadvantage of many machine learning-based methods, such as CNN-GS, FCN-GS, and U-net, is their reliance on the availability of a large supervised/marked data set. However, marking medical images manually requires highly professional technique, which leads to the lack of accurately labeled data in great quantity. In addition, the number of negative samples is far more than the number of positive samples, i.e., the pixels within a single OCT image are largely labeled as "0" (not contained in the Choroidal-Scleral interface boundary) as opposed to being positively labeled as "1" (contained in the Choroidal-Scleral interface boundary), which may affect the training results. In extreme cases, the loss of negative samples will dominate in the training process and may lead to a high accuracy even when the model predicts all the samples to be negative. Taking these into considerations, we propose an improved CNN model-based method which performs well on a small data set and reduces the adverse impact of unbalanced samples. With the choroid boundaries obtained by neural network, we further adopt a l 2 -l q (0 < q < 1) regression model to fit the choroidal layer curve. This model ensures not only the fitting accuracy but also the simplicity of fitting function, leading to a better generalization segmentation result.
This paper is organized as follows. In Section 2, we describe the details of the proposed method including segmentations of the BM and Choroidal-Scleral interface. Experimental evaluation and comparison with other methods are discussed in Section 3. Concluding remarks are given in Section 4.

Materials and Methods
2.1. OCT Data. OCT images were obtained in Chinese schoolchildren aged 8-13 years using spectral domain OCT (SD-OCT, Spectralis HRA+OCT, Heidelberg Engineering, Germany) in Optometry Research Clinic of The Hong Kong Polytechnic University. Consents were obtained from both children and their parents/guardians. The study protocol has been approved by the Human Subjects Ethics Subcommittee of The Hong Kong Polytechnic University and met the tenets of the Declaration of Helsinki. Cross-sectional OCT images with axial resolution of 3.9 μm and transverse resolution of 14 μm were obtained using a light source (peak wavelength of 870 nm) together with a scanning speed of 40000 A-scan/sec. In order to better capture the boundary of choroid, an enhanced depth image scanning mode was adopted. Choroid was then manually segmented using the built-in software in SD-OCT by trained clinicians. There are 146 marked retinal OCT images from 108 patients in total, and we divide them into a training set of 70 images, while the rest are categorized as in the testing set. In our experiments, all the OCT images are preprocessed as grey images with the size of 150 × 600 (height × width) pixels.

2.2.
Overview. The proposed approach is divided into two parts: BM segmentation and Choroidal-Scleral interface segmentation. Physiological tissues mentioned in this work are visualized by diverse colors in Figure 1, and red curves mark the known ground-truth provided by clinicians. For BM segmentation, we start by recognizing the approximate position of RPE, which is the brightest layer in retinal OCT images. We then identify the BM by its physiological feature in the vicinity of RPE. After the extraction of the BM, we consider to segment the Choroidal-Scleral interface by a two-stage process: (1) Partition the OCT image into small patches and input them into the CNN based classifier. The likelihood of the Choroidal-Scleral interface passing through the patch increases with the predicted value's proximity to 1.
(2) Generate a heat map according to the predicted values of patches. Choose appropriate points in the heat map and fit these points to obtain the curve of the Choroidal-Scleral interface.
The details will be discussed in the later sections.
Our numerical experiments in this paper are implemented in Tensorflow 1.13.1, Python 3.6.6, and Cuda 9.0, running on a server with 2 Tesla P100-PCIE GPU with 16 GB memory at 1. Here, we adopt the cubic regression method in consideration of the simplicity of the RPE curve and the high accuracy of extracted points.

Bruch's Membrane Segmentation.
From Figure 1, we can observe that the BM is at the boundary between RPE and the choroid, and below the curve C RPE extracted in the previous step. Therefore, we intend to find appropriate points Q i , i = 1, ⋯, 600 in the lower neighborhood of C RPE , where Q i is with the largest difference in magnitude in the ith column.
In our experiments, the set fQ i , i = 1,⋯,600g is obtained in the region between C RPE and C RPE−5 ; C RPE−5 represents the curve whose point in each column is 5 pixels lower than the corresponding point in C RPE . The width of gap between two curves is decided by the observation that the thickness of RPE is about 5 pixels in OCT images. Finally, the curve acquired by fitting points fQ i , i = 1,⋯,600g is regarded as the BM. Some results of BM segmentation are shown in Figure 2; we can find that our result (green curve) and ground-truth of BM (red curve) coincide with each other. The error for each image is calculated by where BMðiÞ and d BMðiÞ represent the corresponding numbers of row for ith column in our result and ground-truth of BM, respectively. The average error and variance for 76 test images are 1.5189 and 1.1325.

Choroidal-Scleral Interface Segmentation.
In this section, we present the details of the method based on CNN classifier and l 2 -l q (0 < q < 1) fitter to segment the Choroidal-Scleral interface.
2.5.1. Data Preprocessing and Clipping. In order to improve the recognition accuracy of the lower choroidal boundary, we first remove the irrelevant information in the OCT images above the BM. A large size of patch may lead to a small number of samples, while a small size of patch will result in the lack of features in samples. In our proposed approach, we cut images from the training set into a group of square patches with 32 × 32 pixels from left to right, top to bottom with a step-size of 8.
For each 32 × 32 minipatch, if the marked ground-truth passes through its 6 × 6 center area, this minipatch will be labeled as "1" (positive sample) and as "0" (negative sample) otherwise. As shown in Figure 3, the minipatch in red is labeled as "1," and the minipatch in yellow is labeled as "0." The blue frame in the center is the recognition area with a size of 6 × 6 pixels. In [24], the criterion to define positive samples is the marked ground-truth passes through the 32 × 32 patch, otherwise is a negative sample. Thus, the number of positive samples made in [24] is more than ours, and the ratio of positive and negative samples is relatively balanced. However, the positive samples whose edges are passed through by the marked ground-truth may impact the extraction of features in training process. Therefore, we set the 6 × 6 recognition area in our samples to avoid this problem. After finishing the labeling, the layer segmentation problem is converted into a binary classification problem.

CNN Training.
The structure of the neural network used in our work is the Lenet-5 model [29]. It includes 3 convolutional layers and 3 fully connected layers, which is shown in Table 1. Each convolutional layer consists of a layer of convolution and function of local response normalization. The purpose of 3 continuous convolutional layers is to extract features and map the original data into a feature space. The kernel size in each convolution layer is 3 × 3. After the process of feature extraction, 3 fully connected layers are arranged to provide a classification.
Before applying this CNN model to segment the choroid, it is necessary to pretrain it by using training data. After data preprocessing and clipping, we can get 260000 patches with label as a revised training set. Among these patches, about 24000 samples are on-line samples with label 1, while the other 236000 patches are off-line samples with label 0. Note that the ratio of positive and negative samples is about 1 : 11 due to the way of making samples and specialty of images. As mentioned earlier, unbalanced samples will have a negative effect on the training result, and it is necessary to provide a reasonable solution. For clear display, we use f θ CNN : ℝ 32×32 → ð0, 1Þ to represent the CNN model; then, the relationship between the input patches img i and the output values p i can be denoted as where img i is the ith patch of our input and θ is the set of where l i is the label of the corresponding patch, α ∈ ð0, 1Þ ensures a higher weight for the rare online samples, and γ ≥ 0 provides a higher weight for samples hard to be classified. The Focal Loss, a dynamically scaled cross-entropy loss function, is proposed for dealing with samples' imbalance in [30]. From formula (3), we can find that the value of loss function is reducing with a higher p i , which means the class with lower accuracy will dominate in network training.
Next, we need to minimize FLðp i , l i Þ. Let g i ðθÞ = f θ CNN ði mg i Þ, and the problem can be formulated as arg min where I denotes the total number of input patches. To prevent overfitting in CNN, Dropout and l 2 regularization have been used [31][32][33]. In our numerical experiments, we add l 2 regularization in the objective function to prevent overfitting; hence, the regularized problem is written as Adaptive Moment Estimation (Adam) optimizer [34] is applied to solve problem (5).
After finishing the training of the CNN model, we then apply it to segment choroid. Let M ∈ ℝ 150×600 be a matrix for the storage of output values corresponding to the image matrix. At the first stage, we need to implement the following operations: (1) Cut the input OCT image into a group of patches with size of 32 × 32 from left to right, top to bottom with a step-size 3. Every patch matches a submatrix of M with a size of 6 × 6, (2) Input these patches into the trained CNN model one by one and obtain the corresponding predicted value, denoted by p i for the ith patch, i = 1, ⋯, 7600. All the values of elements in the corresponding submatrix are p i , The results of the above process are shown in Figure 5, which is a special example with a lot of noises and mistaken points. Hence, we need to further recorrect the data.

Data Recorrection.
From the above results, we notice that the major area of the choroid can be accurately detected. However, there may be some misjudgments because of the interference information in the image, and these errors will make a negative influence on the segmentation result if we directly fit all the sample points without filter. For this reason, the random sample consensus (RANSAC) algorithm [35] is applied to recorrect the detected points. RANSAC algorithm estimates parameters of a mathematical model from a set of observed data that contains "outliers" by iterations. Basic assumptions of RANSAC are in the following: (i) Data consist of "inliers", i.e., the distribution of data can be explained by some set of model parameters.
(ii) "Outliers" do not fit the model.
(iii) Other data are regarded as noise data.
For RANSAC, it is also assumed that there exists a process of estimating model parameters for a set of given "inliers," and this model can optimally explain or fit the data.
In addition, a shape constraint rule is enforced to help identifying the outlines, namely, the location of the choroid is always below the BM.

l 2 − l q
Fitter. This step sketches out the outer boundary of the choroid with recorrected data. Through the observation of the ground truth, a high-order polynomial gðxÞ = β n x n + β n−1 x n−1 ⋯ +β 1 x + β 0 is applied for curve fitting to improve accuracy. Moreover, to avoid the fitting function being too complicated for stability, we preferably set a relatively sparse group of coefficients fβ n ,⋯,β 1 , β 0 g. In recent years, l q ð0 < q < 1Þ regularization attaches attention and has advantages over smooth, convex regularization for variable selection. Motivated by this, a regression model consists of a l 2 data fitting term and a l q regularization term is used to ensure the sparsity of polynomial coefficients in this paper. The l 2 − l q regression model is as follows: where fðx i , y i Þ, i = 1,⋯,mg (m ≤ 600) denotes the set of recorrected coordinate points, β = ðβ n , β n−1 ,⋯,β 1 , β 0 Þ T ∈ ℝ n+1 is the vector consists of the fitting polynomial coefficients, ∥β ∥ q q = ∑ n i=0 jβ i j q , and λ > 0 is a parameter. Some existing methods such as iteratively reweighted l 1 (IRL1) and l 2 (IRL2) minimization algorithms have been widely studied for solving problem (6), see [36][37][38][39] and references therein. We adopt the hybrid orthogonal matching pursuit-smoothing gradient (OMP-SG) based on lower Step = 8 Pixels

On-line
Off-line CSI passes through center area?

Computational and Mathematical Methods in Medicine
bound theory in [36] to solve problem (6). First, we use the OMP method to generate an initial point β 0 and its support set; then, the SG method is employed to further reduce the objective value of (6), and finally, the numerical solution is purified by deleting its entries with small values. The framework of the hybrid OMP-SG algorithm is given in Algorithm 1, where jΛj denotes the number of elements in the set Λ.
We refer the interested readers to [36] for more details on the OMP-SG algorithm. Moreover, the smoothing function used in the SG method is where ξ > 0 is a smoothing parameter. It is easy to verify that e ϕ is continuously differentiable for any fixed ξ > 0.
Theorem 1. Let β * be a local minimizer of problem (6) satisfying ϕðβ * Þ ≤ ϕðβ 0 Þ for an arbitrarily given point β 0 , 1,⋯,ng: Moreover, the number of nonzero entries in β * is bounded by kβ * k 0 ≤ min ðm, ðϕðβ 0 Þ/λL q ÞÞ: From the upper bound in Theorem 1, the number of nonzero elements of β * is less than ϕðβ omp Þ/λL q . The sparsity of β * is dependent on the choice of λ. A sufficient condition on λ for minimizers of problem (6) to have desirable sparsity can be found in [40].
With the output result by Algorithm 1, we can obtain a high-order polynomial whose coefficient vector is β * to fit the curve as shown in Figure 6(a). The red part is the scatter of recorrection points, i.e., fðx i , y i Þ, i = 1,⋯,mg, and the fitting curve is shown in green. In Figure 6(b), we compare the fitting result and ground-truth in green and red, respectively. Figure 7 shows some of the choroid segmentation results with images in the testing set. These examples display that the regression fitting is visually reasonable and conforms to the ground-truth.
In Figure 8, we give a flowchart to show the process of our choroid segmentation method. The RPE and BM are recognized first by their physiological characteristics. The labeled patches cut from OCT images are used to train a CNN classifier. Then, the input image can be transferred into a
Step 3. Output a numerical solution β * satisfying β * j = β * j , jβ * j j > L, j ∈ Λ 0, otherwise: ( Algorithm 1. Hybrid OMP-SG approach for problem (6).   Computational and Mathematical Methods in Medicine group set of points with a high predicted value by the trained CNN model. After data recorrection by the RANSAC method, the residual points are fitted by a l 2 − l q ð0 < q < 1Þ regression model, and the fitting curve is the desired outer interface of the choroid. Thus, we obtain the segmentation results of the choroid.

Results and Discussion
In this section, we discuss the experimental analysis based on thickness evaluation. We give the measurement criteria of thickness, and then compare the average error, maximal error, and Dice coefficient of our experimental results with the results by the method in [24].

F Hypothesis Testing and T Hypothesis Testing.
We use the Anderson-Darling test to verify both the average thickness marked by clinicians and predicted by our model with approximate normal distribution. Therefore, we can adopt F Hypothesis testing and T Hypothesis testing to analyze the thickness results obtained by our method. By F Hypothesis testing, we can compute the confidence level of the thickness results. Denote X 0 = fx 0 i , i = 1 ,⋯,hg by the thickness sample set in which the element represents an average thickness of choroid given by medical staff in one OCT image. X 1 = fx 1 i , i = 1,⋯,hg is the thickness sample set given by our proposed method. First, we use the F-test to verify whether the variances between the given thickness samples X 0 and our thickness results X 1 are similar.
(i) H 0 : there is no significant difference in the variance between the given sample and our result sample (ii) H 1 : there is a significant difference in the variance between the given sample and our result sample where S 0 and S 1 are the variances of samples in X 0 and X 1 , respectively. The P value of F-test is 0:2092 > 0:05, i.e., H 0 can not be rejected. In other words, it can be ensured that the results by our proposed method are not significantly different from the ground-truth. After verifying the variance, the next step is to verify whether there is a significant difference between the mean values of the two sample sets.
where X 0 and X 1 are mean values of samples in X 0 and X 1 , respectively. The P value of T-test is 0:2898 > 0:05, i.e., H 0 can not be rejected. Therefore, the mean value of our results is not significantly different from the data given by optometrists.

Minimum Distance Method.
In this part, we compare the results of our proposed method and other methods in detail. Since finding the minimum distance point directly in the image will produce a zigzag error, we calculate the distance based on the regression functions of the BM and choroid. Define the set of horizontal ordinate in the result image as S = fx, 1 ≤ x ≤ 600g; CSIðxÞ and BMðxÞ represent the regression function value at x, respectively. The minimum distance method starts from a point ðx, CSIðxÞÞ on the choroidal curve, then finding the corresponding point ðy, BMðyÞÞ on the BM which is closest to point ðx, BMðxÞÞ, and calculating the distance between these two points. As shown in Figure 9, the length of the yellow line is desired.  The thickness function lðxÞ at point x ∈ S is defined as follows: Using this minimum distance method, we can obtain the thickness results by our proposed method. Next, we would like to present some measurements to compare our results with those by other methods.

Average Error and Maximal
Error. In order to evaluate the thickness results, the average error and maximal error in an image are calculated to compare. The average error er r 1 is calculated by where x i = i for i = 1, ⋯, W represents the horizontal coordinate in each image, lðxÞ is the thickness function defined in (12) by our proposed method, l truth ðxÞ is the thickness function provided by the ophthalmologists, and W is the image width (W = 600 in this paper). The maximal error er r 2 is 3.4. Dice Coefficient. Dice coefficient, a metric function in set comparing, is considered to measure the similarity between the segmentation result of the proposed method and the ground truth. It is defined as where sets S pro and S truth consist of the pixels in the segmented choroidal region by our proposed method and the manually labeled choroidal region by experts, respectively. |S | denotes the number of elements in the set S.

Discussion
The details of the comparison in err 1 , err 2 , and Dice coefficient over the whole testing set (76 images) are given in    Table 2. The two-stage segmentation is the method proposed in [24] which is verified that it performs better than some classical approaches such as Graph cut [41], k-means [41], and Graph Search Theory [26]. From Table 2, we can find that our proposed method possesses a smaller average error and larger Dice coefficient than the two-stage segmentation, which implies the effectiveness of the l 2 -l q regression model and other improvements in our method. The comparison in Dice coefficient is visually shown in Figure 10. Yellow and red points represent the average Dice coefficient of our proposed method and the two-stage segmentation, respectively. It is clearly displayed that the Dice coefficient data obtained by our proposed method is generally superior to the data generated by the two-stage segmentation.

Conclusion
In this paper, we propose and implement an automated segmentation method based on CNN classifier and l 2 -l q fitter to detect the region of the choroid. The BM, next to the inner surface of the choroid, is segmented by its physiological characteristic with the recognition of RPE. The extraction of the Choroidal-Scleral interface curve, outer surface of the choroid, is divided into two steps. First, we cut the images into small patches with label "on-line" or "off-line" to train the CNN classifier. The Focal Loss function and ADAM optimizer are used in the process of training the CNN model. Then, a binary classification problem is solved by using this CNN model with input test images. After obtaining the classified data with predicted values, we filter the mistake and noise points by the RANSAC method. In the second step, we adopt a l 2 -l q (0 < q < 1) regression model to fit the discrete and filtered points to generate the desired curve. The hybrid OMP-SG algorithm is employed to solve the l 2 -l q minimization. Segmentation results in some test images are given in Figure 7. Finally, we discuss and evaluate the experimental results in the aspects of average error, maximal error, and Dice coefficient. Comparison details with other methods are shown in Table 2 illustrating the effectiveness of the proposed method. With the help of the proposed method in autosegmentation of choroid from OCT images, the changes of choroidal thickness in response to experimental treatment or diseases could be effectively and accurately evaluated.

Data Availability
The data used to support the findings of this study were supplied by Optometry Research Clinic, School of Optometry, The Hong Kong Polytechnic University, under license and so cannot be made freely available. Requests for access to these data should be made to the Optometry Clinic in the Hong Kong Polytechnic University (Rachel Ka Man Chun, rachel.chun@polyu.edu.hk).