Fast Parabola Detection Using Estimation of Distribution Algorithms

This paper presents a new method based on Estimation of Distribution Algorithms (EDAs) to detect parabolic shapes in synthetic and medical images. The method computes a virtual parabola using three random boundary pixels to calculate the constant values of the generic parabola equation. The resulting parabola is evaluated by matching it with the parabolic shape in the input image by using the Hadamard product as fitness function. This proposed method is evaluated in terms of computational time and compared with two implementations of the generalized Hough transform and RANSAC method for parabola detection. Experimental results show that the proposed method outperforms the comparative methods in terms of execution time about 93.61% on synthetic images and 89% on retinal fundus and human plantar arch images. In addition, experimental results have also shown that the proposed method can be highly suitable for different medical applications.


Introduction
In the pattern recognition field, detection of curves in natural or medical images is a significant and challenging problem since relevant information about an object is highly related to the shape of its boundary. Any curve can be detected by using the Hough transform (HT), if this curve can be represented by a parametric equation [1,2]. Circular Hough transform (CHT) is based on the Hough transform principle and it has been adapted for the detection of circles [3]. For this particular problem, three different parameters ( 0 , 0 , ) that define a circle have to be determined. The parameters ( 0 , 0 ) represent the coordinate of the center and represents the radius [4]. Although the CHT is able to detect suitable approximation to circles in different type of images, the computational time to detect a single curve is high. This disadvantage is due to the fact that each single pixel represents a potential center ( 0 , 0 ), and a range of possible radii have to be tested for each particular pixel. On the other hand, the voting accumulator [5] that indicates the parameters of a curve that exists in the image is computed on a matrix where each cell has the number of intersections of the circles formed by taking a single pixel as the center; if the matrix is plotted in a three-dimensional graph, sharp peaks will be visualized in the cells that have more intersections. The parameters of the circle are computed by taking the highest peak in the accumulator. Since the representation of an accumulator is a matrix, two of the three parameters of the circle are represented by the rows and columns; in order to cover the third parameter, a three-dimensional matrix is 2 Computational and Mathematical Methods in Medicine needed. Although it is possible that several sharp peaks will be formed in the accumulator, when the radius of the tested circle is close to the radius of the circle, these peaks do not represent a real circle. Therefore, once the accumulator has been computed, the next challenge is to find the peak that actually represents the real circle. Due to the nature of the CHT, it would be virtually impossible to find multiple circles in an image; therefore, some studies have proposed different techniques to solve this issue by using techniques such as genetic algorithms [6] or Harmony Search (HSA) [7]. Ayala-Ramirez et al. [6] optimize the computational time of the CHT, by applying a strategy consisting in selecting three random pixels from the edge image as the chromosomes for the genetic algorithm. These three points are used to calculate a center and radius of a circle, and the fitness function evaluates how many pixels of a virtual circle are actually present in the real edge image.
The problem of detecting parabolas can be accomplished by finding the basic parameters of the general equation for parabolic shapes, which are the vertex in ( 0 , 0 ), the angle, and its aperture. In literature, different methods have been proposed to detect parabolas using the Hough transform as a baseline. Maalmi et al. [8] proposed a method that applies genetic algorithms to perform the voting process of the Hough transform. The detected curves are horizontal and the aperture is fixed. This implementation was reported for the detection of crack defects in B-scan images. Oloumi and Rangayyan [9] applied the Hough transform for parabola detection using three parameters: ( 0 , 0 ) that represent the vertex and to describe the latus rectum or aperture. The main application reported of this method is the detection of temporal arcade in retinal fundus images, where parabolas fit retina vessels [10,11]. Other reported works also have been focused on the human eye, for finding the iris [12], and in retina angiography [13].
Another potential application for parabola detection can be seen in medical images for orthopedic diagnostics in the plantar arch. Some of the basic human movements are walking and running; these movements are possible thanks to a complete set of muscles working together. However, if the plantar arch of the human feet is not of the correct size, a set of problems (e.g., back problems) can lead to surgery and prosthetics [14]. In patients with diabetes, it was observed that there is a relationship between forefoot and rear foot pressure with the ulceration on the human foot [15]. Different methods have been proposed to address problems in the orthopedic area. P. S. Kulkarni and V. B. Kulkarni [16] proposed a method to classify the human footprint using a parameter that they call Footprint Index (FPI). First, they found the minimum distance from -axis on the lateral upper and lower part of the footprint; with those two points, they draw a line , where the middle point of will be the center of the arch. Considering this center, two lines and , which are perpendicular to , are computed. These lines will provide the distances from to the arch vertex and from the arch vertex to the edge foot, respectively. The FPI parameter is calculated as the ratio of to . Then, the footprint can be classified as flat foot, normal foot, or high arch foot. Chu et al. [17] proposed a similar process as in [16]; however, some other measurements were computed and were added to the ratios proposed by P. S. Kulkarni and V. B. Kulkarni [16] in an effort to compute the arch height. Zheng et al. [18] used the footprint to perform an analysis and match a subject's gait with the corresponding footprint.
Some other approaches have been proposed for parabola detection. Salehin et al. [19] proposed conic detection by applying Pascal's theorem (i.e., approximating the curve from two tangent lines and a point from the conic). Merazi-Meksen et al. [20] detected parabolic forms from Time-Of-Flight Diffraction images in order to analyze material defects. Detection procedure is named randomized Hough transform and is a combination of Least Squares, Randomized Sampled Consensus (RANSAC), and Hough transform. Certainly, robust fitting may be seen as a nonlinear optimization problem that could be solved iteratively by RANSAC method; in particular, this method is able to cope with outliers to estimate the parameter of a desired mathematical method. RANSAC is a simple and powerful method that could be useful in many applications; depending on the model it could have many parameters to tune, but the probability of convergence increases as more iterations are used. The convergence is not warranted because initialization is chosen randomly from a small data subset (i.e., results are not repeatable). In some special cases, RANSAC is not always capable of obtaining the optimal results for well-conditioned data [21,22].
Most state-of-the-art algorithms use Hough transform for model fitting which is very time demanding. EDAs represent a stochastic optimization technique similar to genetic algorithms, which has begun to attract more attention for solving different problems in the area of image analysis [23,24]. One of the main advantages of EDAs (UMDA) is that they use global statistical information of the best solutions instead of a crossover or mutation operators. On the other hand, UMDA has only two parameters to be tuned, number of individuals and selection rate, since the number of generations can be replaced by another convergence criterion such as the average or standard deviation of the population.
In this paper, a new method for the parabola detection problem based on Estimation of Distribution Algorithms (EDAs) is proposed. The method is evaluated in terms of computational time on synthetic and medical images of the retina and human plantar arch. Since EDAs represent an evolutionary computation technique, the fitness function used in this work is based on the Hadamard product. EDAs have shown remarkable advantages in order to solve optimization and model fitting problems. In our proposed approach, Univariate Marginal Distribution Algorithm (UMDA) [23] performs a single detection of a parabola 94% faster than HT and 45% faster than RANSAC.
Finally, the results of the proposed method are compared with those obtained by using the Hough transform implementation of Sanchez found in the MATLAB5 central [25], the parabolic shapes detection provided by the MIPAV5 [26] software, and additionally the well-known RANSAC method proposed by Fischler and Bolles [27]. The remainder of this paper is organized as follows: in Section 2, the fundamentals of the Hough transform, the Estimation of Distribution Algorithms, and the proposed method for detecting parabolic shapes are explained in detail. Experimental results are presented and discussed in Section 3, and conclusions are given in Section 4.

Parabola Detection.
The detection of curves can be achieved by exploring the duality between points on a curve and the parameters representing that curve; this method is known as Hough transform (HT) [28]. Geometric curves such as circles, ellipses, parabolas, and hyperbolas can be parameterized in a polar coordinate system ( , ), if the equation that represents the curve satisfies one of the following two equations: where represents the eccentricity and is the shortest distance between the focal points and the directrix of the curve. Since the present work is focused on the detection of parabolic curves, in these equations, the parameter must have a value of 1. The most common parameters to identify a parabola are the vertex ( 0 , 0 ), the coordinates ( , ) of the focus, the orientation 0 of the axis of symmetry with respect to the coordinates axes, and the coefficients of (1). On the other hand, the equation in the Euclidean space that represents a parabolic curve with directrix parallel to the -axis can be defined as follows: and if the directrix is parallel to the -axis, the equation is modified as follows: where the parameter is used to modify a parabola in two different aspects: the aperture direction and its magnitude.
For large values of | |, the aperture is increased, and if is positive, the parabola of (2) opens up or to the right, while in (3) the parabola opens to the positive -axis, as it is shown in Figures 1(a) and 1(b), respectively. Hence, a parabolic shape is completely determined in the Euclidean space by defining the set of parameters { 0 , 0 , }.
To detect parabolic shapes in images using the Hough transform algorithm, all the pixels with intensity different to zero and with coordinates ( , ) represent a potential curve in the Hough space. However, a drawback of the Hough transform is the resolution used to generate the accumulator because the input parameters such as the aperture are unknown. Consequently, there is a tradeoff between precision and fast execution time. When the tested aperture is close to the real aperture, the accumulator shows several peaks with high magnitude as it can be seen in Figure 2, and by using a low precision for the aperture parameter, the real parabola may not be found.
The main disadvantages of the HT are the computational time it takes to determine the best parameter values and the selection of the optimal peak in the accumulator, where the most commonly strategy used to find it is the local maxima method.

Estimation of Distribution Algorithms. The Estimation of
Distribution Algorithms (EDAs) represent an extension to the field of evolutionary computation (EC). EDAs are useful to solve problems in the discrete and continuous domain by using some statistical information of potential solutions, also called individuals [29][30][31]. Similar to EC techniques, EDAs perform the optimization task by using binary encoding and selection operators over a set of potential solutions called population. The main difference regarding the classical EC techniques is that EDAs replace the crossover and mutation operators by building probabilistic models at each generation based on global statistical information of the best individuals.
By using these explicit probabilistic models, EDAs are able to solve optimization problems to cope with high level of epistasis. The principal advantages of EDAs over genetic algorithms are the absence of multiple parameters to be tuned and the expressiveness and transparency of the probabilistic model that guides the search process. EDAs have been proven to be better suited to some applications than GAs, while achieving competitive and robust results in the majority of tackled problems [24]. By the way, EDA is a discrete algorithm and PSO has been applied for continuous processes; although PSO has been applied in different works in discrete problems, its real potential is in the continuous domain [32]. PSO is similar to the genetic algorithm (GA) in the sense that these two evolutionary heuristics are population-based search methods by using a combination of deterministic and probabilistic rules. PSO has the same effectiveness as GA but with a significantly better computational efficiency [33]. Additional advantages of using EDAs are that they have the ability to adapt their operators, provide optimization practitioners with a Roadmap of how the problem was solved, use prior knowledge by injecting specific solutions, and require reduced memory and reduced computational times [29]. In this work, the Univariate Marginal Distribution Algorithm (UMDA) has been adopted as optimization strategy, because it is ideal for linear problems [23,34,35]. UMDA uses a binary codification for each possible solution, and it generates a probability vector p = ( 1 , 2 , 3 , . . . , ) , where is the marginal probability of the th bit of each individual, to be one or zero in the next generation. Then, UMDA tries to approximate the probability distribution of the individuals in P. This can be defined as follows: where = ( 1 , 2 , . . . , ) is the binary value of the th bit in the individual and is the th random value of the vector . An objective function is needed to select a subset of best individuals. This function is used to determine the fitness of the current potential solution. A probability vector is computed from the subset of candidate solutions in order to generate a new population based on its distribution. This process is iteratively performed until a stop condition is achieved, and the best solution is chosen to be the individual with the best fitness value along the evolutionary process.

Proposed Method.
This section describes the proposed method to detect parabolic shapes in images using the population-based method of UMDA along with the objective function to evaluate the fitness for each potential solution. The most representative steps of the proposed methodology for parabola detection are represented in Algorithm 1.

Individual Representation.
A parametric equation that describes a parabola is required to represent a potential solution for finding parabolic shapes in images. In the Cartesian coordinate system, a parabola can be described by using its general form as follows: where , , and represent unknown constant values, which can be determined using three independent pixels from the spatial image domain. Considering that the search space is a binary image, the coordinates (cols( ), rows( )) of three random pixels are selected in order to form an individual for the population of the UMDA. For simplicity, the label is used for cols( ) and for rows( ).
Computational and Mathematical Methods in Medicine ← alculateFrequencyOfComponents(Selected) (7) Offspring ← 0 (8) for to PopulationSize do (9) Offspring ← ProbabilisticallyConstructSolution(V) (10) Entr ← EntropyMeasurements(Offspring, Model) (11) end (12) To illustrate the representation of a potential solution in an optimization model fitting process, Table 1 shows an example of an individual formed with the integer indexes {22, 58, 39} represented in base-2 numeral system that can be coded by 8-bits string. This individual is formed by concatenating the binary string indexes { , , } of the three selected pixels in a single vector as it was used by Ayala-Ramirez et al. [6]. The number of genes for each individual directly depends on the number of potential pixels in the spatial image domain.
Since the proposed method uses the indexes to form all the individuals, the UMDA method can easily eliminate unfeasible solutions by using a function to measure the quality of an individual.

Fitness Function.
To evaluate the fitness of an individual, a binary image VS (virtual shape) of the same size that the input image is generated. The value of all the pixels on the virtual image are set to zero, and this leads to a black color image. Then, the vertex and aperture 4 of the parabola represented by (5) are computed using the values , , and obtained from the individual representation. Subsequently, the edge points of the parabola are computed and stored in VS by setting to 1 the corresponding pixels. Finally, the Hadamard product [36] between the binary form binary of the input image and the virtual image is calculated as follows: The resulting image Hd contains only those pixels where binary and VS match. Figure 3(a) illustrates an example with the three points taken from an individual to form a parabola in the virtual shape. It can be seen that the Hadamard product obtains partial information of the real parabola. Figure 3(b) shows an example where the three points taken from the individual are part of a parabola that fits perfectly with the parabolic shape of the input image.
The fitness function used to assess the quality of potential solutions is the number of pixels resulting of the Hadamard product. Given that the binary image of the virtual shape is initialized with all pixels on black (intensity value of zero), with more matching pixels in Hd more white pixels will appear.

Results and Discussion
In this section, the proposed method for parabola detection is applied on synthetic images and medical images of the retina and human plantar arch. In order to assess the proposed method, it is compared with the Hough transform by applying the algorithm of Sanchez found in the MATLAB central [25] and by running the Hough transform for parabolic shapes algorithm provided by the MIPAV [26] software, that can be downloaded from the website [37]. However, considering that MIPAV and MATLAB parabola detection implementations are based on Hough transform, RANSAC algorithm [21] was implemented to analyze the performance of the proposed algorithm (cf . Tables 3 and 4). The computational implementations are performed by using the MATLAB version 2013b, on a computer with an Intel Core i5, 4 GB of RAM, and 2.4 GHz processor. Moreover, computational experiments using UMDA were performed using 30 runs in order to perform a statistical analysis of the stochastic process applying the parameter values presented in Table 2.
These parameter settings were determined based on the solutions that give the best tradeoff between precision (lowest RMSE) and computational time using 30 trials to determine the best set of parameters. Moreover, different works were taken into account such as [23].

Application on Synthetic
Images. The first experiment was performed by using synthetic images like the one in Figure 4(a). This synthetic image was generated by drawing randomly located parametric objects such as lines, circles, and a parabola in order to evaluate the algorithm in a controlled way. Table 3 presents a comparative analysis using Figure 4(a). The method presented in [25] shows the largest execution time, given that the result is the execution time per pixel and this image contains 125,899 pixels. Hence, this method Table 3: Comparative analysis of execution time using the binary image (Figure 4(a)).

Method
Execution time (s) Number of iterations  [26] 37.68 -RANSAC [27] 16.35 5000 Table 4: Comparative analysis of execution time using the skeleton of the binary image (Figure 4(c)).  [26] 36.47 -RANSAC [27] 14.86 5000 depends entirely on the number of potential pixels. The proposed method performs a reduction of 93.61% of the execution time achieved with the MIPAV software, given that the UMDA method has 10 individuals and taking into account the mean of the number of iterations, only 195 function evaluations are required to obtain the best result as it is shown in Figure 4(b).

Method
To ensure that the algorithm is robust and results are consistent with different input conditions, the complexity of the image was increased by adding "salt and pepper" noise. The test was performed by obtaining the skeleton of Figure 4(a) and adding a 10% of "salt and pepper" noise, as it is shown in Figure 4(d). In Table 4, a comparison of the three methods using Figure 4(d) is illustrated. Since the method in [25] works over a single pixel, the result is consistent with Table 3. The proposed method achieves a reduction of 96.14% in comparison with the MIPAV software. Considering that the amount of pixels is lower than the binary image, the number of iterations is congruent with the reduction in the execution Computational and Mathematical Methods in Medicine 7  time. In this test, the average number of function evaluations to obtain the best result using UMDA was 173 (Figure 4(d)).
In order to quantify the fitness of detected parabolas, ten synthetic images with parabolas were generated using (5) with random values of , , and . Salt and pepper noise in the regular range of [0, 25%] was added to the images giving a total of 260 test images. Normalized root-mean-square deviation (RMSD) or root-mean-square error (RMSE) was computed to compare original parabolas to the one detected by the proposed algorithm, according to the following: To obtain a true normalized version, (9) can be divided by max( real ), mean( real ), or Cardinality( real == ). In our experiments, the Cardinality was used to prioritize the fitting curves having maximum number of matched points. A test by adding "salt and pepper" noise over the range [1,25] percent to Figure 4(c) (skeleton image) was carried out in order to evaluate the performance of the proposed method in terms of computational time with different amounts of noise. Figure 5(a) shows the average execution time of the obtained results. Since the image has fewer potential pixels than the binary image, the execution time with 1% of noise was low in comparison with previous results. Considering that the UMDA method is a stochastic algorithm, its performance is not shown as a straight line; however, the mean time tends to increase when the amount of noise is also increased. Figure 5(b) shows the standard deviation of the performed tests; since the graph tends to a constant mean over all the percentages of noise, it can be assumed that the algorithm is stable and robust in the presence of noise over the time (i.e., noise may affect quality results). This time stability is originated by the normalized RMSD metric, because more noisy pixels are considered as part of the detected parabola while the global error is also increasing. However, it is observed that most methods (RANSAC, MIPAV, and Hough transform) fail to detect correctly the parabola parameters on images having more than 25% of salt and pepper noise, and the proposed method could detect parabolas on synthetic and real images contaminated until 30%. Figure 6(a) is a synthetic image for fitness testing, and Figures 6(b) and 6(c) present detected parabolas using our proposed algorithm and MIPAV software, respectively. RMSD values for both times of individual detection are 0.0026 and 0.0089, demonstrating the validity of our method. In Figure 7, comparative results between the proposed method and RANSAC are depicted, and Table 5 shows the normalized errors in parabola detection on original images and with 15% of noise. The fast and stable performance of the proposed algorithm compared to RANSAC is noticed; besides, RANSAC has many parameters to tune and is computationally expensive and there is no warranty for reproducing the same results in a new evaluation.

Retinal Fundus
Images. The detection of parametric objects has been applied in different areas of engineering. In medical imaging, the detection of parabolas in retinal fundus images has a particular importance, since the form of the retinal vessels can be approximated to the parabola parametric form. In the previous work, the standard model, the Hough transform, has been used to find the parabola that fits the best on the retinal images. For instance, in the work reported by Oloumi et al. [10,11], the form of the retina vessels is approximated to a parabola with the aim of monitoring the measurements of the openness of the major temporal arcade (MTA). This study facilitates the quantitative analysis of the MTA and overcomes the limitations associated with manual analysis. Another application is reported by Yu et al. [38], where the retinal images are used to find the vertex of all the vessels; this vertex is known as fovea [39]. This study is significant because the position of the vertex can give the grade of diabetic retinopathy.
In the tests reported in the present work, the proposed method was used to approximate the retinal vessels to a parabola. Figure 8 shows the results obtained, where the best fit for the parabola in the image is shown in red color. As in the work reported by Yu et al. [38], the public database DRIVE [40] that contains a set of 20 retinal fundus images was used. Images were preprocessed by applying Gaussian matched filters for vessel detection. Table 6 presents the statistical analysis of the results obtained by applying the proposed and comparative methods to the DRIVE database. The method presented in [25] remains constant with the execution time per pixel; then, this execution time for this algorithm is the largest time. Since the proposed method accomplishes the best result with only 165 function evaluations, it is possible to achieve a reduction in terms of computational time of 89% in comparison with the one performed on the MIPAV software.  [26] 43.65 -RANSAC [21] 16.66 5000 Given that the skeleton of an image passes through the center section of a set of pixels, if the proposed algorithm is applied to the skeleton of the retinal images, it is then expected for the algorithm to achieve a lower execution time than in the binary image. In Table 7, the statistics of the execution time for the skeleton images are shown. The results are consistent with those obtained for the synthetic image, and as it can be seen, the proposed method clearly outperforms the execution time of HT algorithm from Sanchez and the MIPAV software.   [26] 41.87 -RANSAC [21] 12.99 5000 Since the proposed method takes three pixels to compute a parabola, it can be assumed that the number of iterations to achieve the best result is not related to the size of the image. This assumption is validated with the results shown in Tables  6 and 7, where the average number of iterations for both the binary image and the skeleton image is less than 30.

Plantar Arch
Images. The form of the plantar arch in humans can conduct to several conditions if the contact area with the floor, given by the plantar arch, is not of the correct size. In clinical practice, the procedure for providing a diagnosis of flatfoot degree is by visual inspection of the specialist. Therefore, the proposed method in this work can be applied to plantar arch images to approximate the heel and the plantar arch to parabolas; hence, these parameters could be used to assist a medical diagnostic. The image database used in this test was created by the authors and approved by a specialist (Dr. Carlos Reséndiz Ramírez). This database is composed of 80 images of size 285 × 707 pixels in RGB color space of left and right foot collected from different patients. Images were first preprocessed, to reduce the amount of data that was not useful for this study, as described below.
Let be a grayscale image of the foot to be analyzed, with and as the number of pixels for the width and height, respectively; then, The first step of the method consists in smoothing the image by applying a mean filter, that is, convolving the image with a filter mask: This filter removes part of the spurious pixels, improving the probability of finding the curves of interest. After the smoothing step, gradients are computed on the image in order to find the edges of the footprint. Canny edge detection is the algorithm used to perform this step. The resulting 10 Computational and Mathematical Methods in Medicine The binary image is the result of the preprocessing stage in the proposed method. This image is referred to as the edge image.
In order to address the plantar arch issue, the foot image was divided into three sections from the fingers to the heel, where each section contains 33.33% of the image. The sections of interest for this study are only those two that include the plantar arch and the heel. Table 8 presents a comparative analysis of the execution time of the algorithms over the database of the human plantar arch images. As it can be observed, the method developed in [25] maintains the average time per pixel, and so, it takes the largest execution time. The proposed method performs a reduction of 89.96% of the execution time achieved on the MIPAV software; this is because the algorithm requires only an average of 135 function evaluations to obtain the best result. Figure 9 shows a subset of human plantar arch images,   [26] 63.78 -RANSAC [21] 22 where the heel and plantar arch were parameterized with the proposed method.
In addition, the comparative results in Table 8 show that the proposed method outperforms the HT algorithm from Sanchez and the algorithm from the MIPAV software in terms of computational time. Figure 9 presents the results of the proposed method in different plantar arch images, where the rows, from (a) to (d), show the real images, the binary images, the edge images with the detected parabolas, and the output of the proposed method.
Finally, Figure 10 shows the average fitness value through the generations of UMDA over the 30 runs of the proposed method. This graph represents the iterations that were required by the UMDA method to find the two parabolas in the image.

Conclusion
In this paper, a new method based on the Estimation of Distribution Algorithms (EDAs) has been proposed to detect parabolic shapes. The method computes the constant values of the generic parabola equation by selecting three random pixels from the input image. The proposed method was evaluated in terms of computational time and compared with freely available implementations of the parabola Hough transform. According to experimental results, the average time of the proposed method is significantly better (0.0036 seconds) compared to those obtained by the Hough transform, since this method takes an average of 6.14 seconds to compute the accumulator for one single boundary pixel. In addition, experimental results have also shown that the proposed method outperforms the two comparative methods in terms of execution time saving 93.61% on synthetic images and 89% on the medical images of retinal fundus and human plantar arch.