An Automatic Traffic Sign Detection and Recognition System Based on Colour Segmentation, Shape Matching, and SVM

The main objective of this study is to develop an efficient TSDR system which contains an enriched dataset of Malaysian traffic signs. The developed technique is invariant to variable lighting, rotation, translation, and viewing angle and has a low computational time with a low false positive rate. The development of the system has three working stages: image preprocessing, detection, and recognition. A demonstration of the system using RGB colour segmentation and shape matching followed by a support vector machine (SVM) classifier led to promising results: an accuracy of 95.71%, a false positive rate of 0.9%, and a processing time of 0.43 s. The area under the receiver operating characteristic (ROC) curve was used to statistically evaluate the recognition performance. The accuracy of the developed system is relatively high and the computational time is relatively low, which will be helpful for classifying traffic signs, especially on highways around Malaysia. The low false positive rate will increase the system's stability and reliability in real-time applications.


Introduction
To address concerns over road and transportation safety, automatic traffic sign detection and recognition (TSDR) systems have been introduced. An automatic TSDR system can detect and recognise traffic signs within images captured by cameras or imaging sensors [1]. In adverse traffic conditions, the driver may not notice traffic signs, which may cause accidents. In such scenarios, the TSDR system comes into action. The main objective of research on TSDR is to improve the robustness and efficiency of the TSDR system. Developing an automatic TSDR system is a challenging task given the continuous changes in the environment and lighting conditions. Other issues that need to be addressed include partial occlusion, multiple traffic signs appearing at the same time, and blurring and fading of traffic signs, all of which can hinder detection. To apply a TSDR system in a real-time environment, a fast algorithm is needed. As well as dealing with these issues, a recognition system should also avoid erroneous recognition of nonsigns.
The aim of this research is to develop an efficient TSDR system which can detect and classify traffic signs into different classes in a real-time environment. For detecting red traffic signs, a combined colour and shape based algorithm is presented, which speeds up the detection stage, and for recognition, SVMs with bagged kernels are introduced.
This paper is organized as follows: Section 2 presents related work on the development of TSDR systems. In Section 3, the overall methodology is discussed. The experimental results and discussion are summarized in Section 4. In Section 5, the conclusion is given and some suggestions are made for future improvement in the field of automatic traffic sign detection and recognition.

Related Work
According to [2], the first work on automated traffic sign detection was reported in Japan in 1984. This attempt was followed by several methods introduced by different researchers to develop an efficient TSDR system and minimize the issues stated above. An efficient TSDR system can be divided into several stages: preprocessing, detection, tracking, and recognition. In the preprocessing stage the visual appearance of the images is enhanced. Different colour and shape based approaches are used to minimize the effect of the environment on the test images [3][4][5][6]. The goal of traffic sign detection is to identify the region of interest (ROI) in which a traffic sign is supposed to be found and to verify the sign after a large-scale search for candidates within an image [7]. Different colour and shape based approaches are used by researchers to detect the ROI. The popular colour based detection methods are HSI/HSV Transformation [8,9], Region Growing [10], Colour Indexing [11], and YCbCr colour space transform [12]. As colour information can be unreliable due to illumination and weather changes, shape based algorithms were introduced. The popular shape based approaches are Hough Transformation [13][14][15], Similarity Detection [16], Distance Transform Matching [17], and Edges with Haar-like features [18,19]. The tracking stage is necessary to ensure real-time recognition. In addition, the information provided by the images of the traffic signs helps verify the correct identification and thus detect and follow the object [20]. The most commonly adopted tracker is the Kalman filter [18,21,22].
Several methods have been used by researchers for recognizing traffic signs. Ohara et al. [23] and Torresen et al. [24] used the Template Matching technique, which is a fast and straightforward method. A Genetic Algorithm was used by Aoyagi and Asakura [25] and de la Escalera et al. [26], and is said to be unaffected by the illumination problem. The main advantages of AdaBoost are its simplicity, feature selection for large datasets, and generalization [27]. Li et al. [28] used AdaBoost learning with five classical Haar wavelets and four HOG (Histogram of Oriented Gradients) features. Greenhalgh and Mirmehdi [29,30] compared SVM, MLP, HOG-based classifiers, and Decision Trees and found that a Decision Tree has the highest accuracy rate and the lowest computational time. Its accuracy is approximately 94.2%, whereas the accuracy of the SVM is 87.8% and that of the MLP is 89.2%. The Neural Network is flexible, adaptive, and robust [31]. Hechri and Mtibaa [12] used a 3-layer MLP network whereas Sheng et al. [32] used a Probabilistic Neural Network for the recognition process. The Support Vector Machine (SVM) is another popular method used by researchers; it is robust against illumination and rotation and has a very high accuracy. Yang et al. [33] and García-Garrido et al. [34] used SVM with Gaussian kernels for recognition whereas Park and Kim [35] used an advanced SVM technique that improved the computational time and the accuracy rate for grey-scale images.
To improve the recognition rate for damaged or partially occluded signs, Soheilian et al. [36] used template matching followed by a 3D reconstruction algorithm. The distortion-invariant fringe-adjusted joint transform correlation (FJTC) was used by Khan et al. [37] and Principal Component Analysis (PCA) was used by Sebanja and Megherbi [38]; both have a very high accuracy rate. In [39], Prieto and Allen used a self-organizing map (SOM) for recognition, whose main idea was to apply the SOM at every level of RSs, with a hit rate of 99%.
In our approach, to reduce the processing time, RGB segmentation and shape matching based detection and an SVM with bagged kernel are used for recognizing red traffic signs. Grey-scale images are used to make our detection and recognition algorithm more robust to changes in illumination.

Image Acquisition.
The samples are collected with an inexpensive onboard camera (Canon SX170 IS) connected to a laptop placed inside a vehicle (Figure 1). The images were taken on different roads and highways in Malaysia under various weather conditions (Table 1) from 8:00 A.M. to 8:00 P.M. at two-second intervals. The camera is placed on the left side of the dashboard so that it can capture the traffic signs on the left side of the road. The aim of this stage is to create a database of traffic sign images under different variations.

Image Preprocessing.
Image preprocessing is an important part of the TSDR system; its main purpose is to remove low-frequency background noise, normalise the intensity of the individual images, remove reflections, and mask portions of images. Below is a description of the selected image preprocessing techniques. The input image is divided into its R, G, and B channels. In the proposed approach, a threshold filter is applied to each channel to select those regions of the image where the pixel values fall in the range of the target object. For example, for traffic signs with a red background (such as stop signs), the threshold for channel R is pixels with values in the range of 90-255 and for channels G and B the range is 0-70. The region of interest (ROI) is the logical sum of the three filtered channels of R, G, and B.
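The per-channel thresholding described above can be sketched as follows. This is a minimal illustration, assuming an 8-bit RGB image held in a NumPy array; the function names and the `mode` switch are hypothetical and not part of the original system. The `"or"` mode follows the "logical sum" wording of the text, while the `"and"` mode is an added assumption that keeps only pixels passing all three thresholds.

```python
import numpy as np

# Thresholds quoted in the text for signs with a red background.
R_RANGE = (90, 255)
G_RANGE = (0, 70)
B_RANGE = (0, 70)


def channel_mask(channel, lo, hi):
    """Boolean mask of pixels whose value lies in [lo, hi]."""
    return (channel >= lo) & (channel <= hi)


def red_roi_mask(image, mode="or"):
    """Combine the three filtered channels of an H x W x 3 RGB uint8 image.

    mode="or"  : logical sum of the three masks, as worded in the text.
    mode="and" : assumption, keeps only pixels passing all three thresholds.
    """
    r = channel_mask(image[..., 0], *R_RANGE)
    g = channel_mask(image[..., 1], *G_RANGE)
    b = channel_mask(image[..., 2], *B_RANGE)
    if mode == "or":
        return r | g | b
    if mode == "and":
        return r & g & b
    raise ValueError("mode must be 'or' or 'and'")


# Tiny 1 x 3 image: a red pixel, a white pixel, and a dark grey pixel.
demo = np.array([[[200, 30, 30], [255, 255, 255], [40, 40, 40]]], dtype=np.uint8)
strict = red_roi_mask(demo, mode="and")
loose = red_roi_mask(demo, mode="or")
```

On the three demo pixels, the stricter combination keeps only the red pixel, whereas the logical sum keeps every pixel that passes at least one channel threshold.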

Shape Matching Based Detection.
The idea is to use the colour characteristics of the target object to accelerate the procedure without employing model-based classifiers, which are time consuming [40][41][42]. After filtering and analysing the features of the detected object, the traffic sign candidates are selected based on shape matching. The flow chart of the system is shown in Figure 2.

Objects Features Analysing.
One of the important steps is to eliminate noise from the image in order to deal with the ROI more effectively. Appropriate filters have an enormous effect on the accuracy and speed of the procedure without deleting any useful information. In the proposed system, a median filter was used for image smoothing and for filling up the smaller regions to extract the region of interest.
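A median filter of the kind mentioned above can be sketched with NumPy alone. The 3 x 3 window and the edge padding are assumptions made for illustration; the text does not state the window size used in the system.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median_filter(mask, size=3):
    """Median-filter a 2-D array with a size x size window (edge-padded)."""
    if size % 2 == 0:
        raise ValueError("size must be odd")
    pad = size // 2
    padded = np.pad(mask.astype(np.uint8), pad, mode="edge")
    # One size x size window per output pixel.
    windows = sliding_window_view(padded, (size, size))
    return np.median(windows, axis=(-2, -1)).astype(np.uint8)


noisy_region = np.ones((5, 5), dtype=np.uint8)
noisy_region[2, 2] = 0           # a one-pixel hole inside a region
noisy_background = np.zeros((5, 5), dtype=np.uint8)
noisy_background[1, 3] = 1       # an isolated speck on the background

filled = median_filter(noisy_region)
cleaned = median_filter(noisy_background)
```

In this toy case the filter fills the single-pixel hole and removes the isolated speck, which is the smoothing and filling behaviour the paragraph relies on.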

Shape Matching and Candidate Selection.
As almost all traffic signs containing red colour are round or octagonal, the proposed method draws on these common shapes to detect hypothetical shapes which are close to traffic signs. Those regions with a value of $R$ in the range of 0.7-1.3 are accepted as candidates for traffic signs, where $R$ is computed from $A$, the area of the region, and $W$, its longest width.
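The exact expression for the shape measure is not recoverable from the text, so the sketch below rests on an assumption: the region's area is compared with the area of a circle whose diameter equals the region's longest width, giving 4A/(πW²). This choice equals 1 for a perfect disc and about 0.9 for a regular octagon, so both fall inside the 0.7-1.3 acceptance range. The function names are hypothetical.

```python
import math


def shape_ratio(area, longest_width):
    """Assumed measure: region area over the area of a circle of that width."""
    if longest_width <= 0:
        raise ValueError("longest_width must be positive")
    return 4.0 * area / (math.pi * longest_width ** 2)


def is_candidate(area, longest_width, lo=0.7, hi=1.3):
    """Accept a region when the assumed ratio lies in the range from the text."""
    return lo <= shape_ratio(area, longest_width) <= hi


circle = shape_ratio(math.pi * 20 ** 2, 40)              # disc of radius 20
octagon = shape_ratio(2 * math.sqrt(2) * 20 ** 2, 40)    # regular octagon, circumradius 20
strip = shape_ratio(160, 40)                             # thin region: area 160, width 40
```

Under this assumed measure, elongated regions such as the thin strip score far below 0.7 and would be rejected.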

Traffic Sign Detection.
The area range for road signs determines the distance at which the system can detect the traffic sign. Outside this range, objects with the same range of pixel values cannot be traffic signs. At this level, crucial information such as the centre, area, and longest width of each region is calculated. This information is used to decide whether or not each region is a traffic sign. The detected traffic sign blob images are then passed to the SVM for recognition.
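The region measurements named above (centre, area, longest width) could be computed along the following lines. This is a sketch under stated assumptions: the mask holds a single region (connected-component labelling is omitted), the longest width is taken as the larger side of the bounding box, and the area limits passed to `in_area_range` are hypothetical since the text does not give them.

```python
import numpy as np


def region_features(mask):
    """Centre (row, col), pixel area, and longest bounding-box side of a mask."""
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        return None
    height = int(rows.max() - rows.min() + 1)
    width = int(cols.max() - cols.min() + 1)
    return {
        "centre": (float(rows.mean()), float(cols.mean())),
        "area": int(rows.size),
        "longest_width": max(height, width),
    }


def in_area_range(area, min_area, max_area):
    """Keep only regions whose area is plausible for a sign (limits assumed)."""
    return min_area <= area <= max_area


# Synthetic disc of radius 15 centred at (20, 20) in a 41 x 41 mask.
yy, xx = np.mgrid[0:41, 0:41]
disc = (yy - 20) ** 2 + (xx - 20) ** 2 <= 15 ** 2
feats = region_features(disc)
```

For the synthetic disc, the centre comes out at (20, 20) and the longest width at 31 pixels, with an area close to that of a circle of radius 15.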

Support Vector Machine (SVM) Based Recognition.
After the detection of a traffic sign, the region of interest (ROI) is passed to the SVM for recognition. The SVM is one of the most successful kernel methods and is trained on a labeled dataset $\{(x_i, y_i)\}_{i=1}^{n}$, where $x_i \in \mathbb{R}^{N}$ and $y_i \in \{-1, +1\}$. In the semisupervised SVM, the total image is clustered for building the bagged kernel. Then, the base kernel is modified. A number of SVMs are trained separately using a bootstrap algorithm and are then aggregated via a suitable combination technique. A bagged kernel is a kernel function encoding the similarity between unlabeled samples [43]. Given a dataset PQ of training samples, bootstrapping builds $B$ replicate training datasets $\{PQ_b \mid b = 1, 2, \ldots, B\}$ by repeated random resampling with replacement from PQ. For a dataset $X = \{x_i\}_{i=1}^{n}$, kernel methods compare training samples using pairwise inner products between mapped samples. Thus the final kernel matrix is $K_{ij} = K(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle$.
For the dataset formation, a set of 400 traffic sign images is generated from 100 traffic sign samples, and for nonsign images a set of 1000 images is generated from 250 nonsign samples; these are collected randomly by a camera attached to a car at different times of the day and in varying weather conditions. The dataset also includes partially occluded, slightly damaged, faded, and blurred signs to make the system more successful in a real-time environment. All the candidates are scaled down to 25 × 25 pixels, in steps of a factor of 1.2, to ease the feature extraction process.
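The bootstrap step that produces the replicate training datasets can be illustrated with the standard library. This is a minimal sketch assuming plain resampling with replacement, with each replicate the same size as the original dataset; the function name, the seed, and the replicate count are hypothetical.

```python
import random


def bootstrap_replicates(dataset, n_replicates, seed=0):
    """Draw n_replicates datasets, each sampled with replacement from dataset."""
    rng = random.Random(seed)
    n = len(dataset)
    return [
        [dataset[rng.randrange(n)] for _ in range(n)]
        for _ in range(n_replicates)
    ]


samples = ["s%d" % i for i in range(10)]
replicates = bootstrap_replicates(samples, n_replicates=5)
```

Each replicate may contain some samples more than once and omit others, which is what lets separately trained SVMs differ before they are aggregated.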
The proposed algorithm is discussed in the following steps.
(1) The base SVM kernel $K_{\mathrm{SVM}}$ is computed.
(2) The $k$-means algorithm is run $t$ times with different initializations but with the same number of clusters $k$. The result is $p = 1, 2, \ldots, t$ cluster assignments $c_p(x_i)$ for each sample $x_i$.
(3) A bagged kernel $K_{\mathrm{bag}}$ is built based on the fraction of times that $x_i$ and $x_j$ are assigned to the same cluster: $K_{\mathrm{bag}}(x_i, x_j) = \frac{1}{t} \sum_{p=1}^{t} [c_p(x_i) = c_p(x_j)]$, where $[c_p(x_i) = c_p(x_j)]$ returns "1" if samples $x_i$ and $x_j$ belong to the same cluster according to the $p$th realization of the clustering $c_p(\cdot)$ and "-1" otherwise.
(4) The sum or the product of the original and bagged kernels is taken: $K(x_i, x_j) = K_{\mathrm{SVM}}(x_i, x_j) + K_{\mathrm{bag}}(x_i, x_j)$ or $K(x_i, x_j) = K_{\mathrm{SVM}}(x_i, x_j) \cdot K_{\mathrm{bag}}(x_i, x_j)$.
(5) With the resultant modified kernel $K(x_i, x_j)$, an SVM is trained.
The flow chart of the overall SVM with bagged kernel is shown in Figure 3.
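Steps (1) to (4) can be sketched in NumPy as below. Several choices are assumptions made only for illustration: a Gaussian base kernel, a plain Lloyd-style k-means written inline, a 0/1 indicator so that the bagged entries are fractions (the text states "-1" for the non-matching case), and toy two-cluster data. None of the function names come from the original system.

```python
import numpy as np


def rbf_kernel(X, gamma=1.0):
    """Base kernel (assumed Gaussian): K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-gamma * d2)


def kmeans_labels(X, k, rng, n_iter=20):
    """Plain Lloyd k-means from a random initialization; returns cluster labels."""
    centres = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centres[c] = X[labels == c].mean(axis=0)
    return labels


def bagged_kernel(X, k, t, seed=0):
    """Fraction of the t clusterings in which each pair shares a cluster."""
    rng = np.random.default_rng(seed)
    K = np.zeros((len(X), len(X)))
    for _ in range(t):
        labels = kmeans_labels(X, k, rng)
        K += labels[:, None] == labels[None, :]   # 0/1 indicator (assumption)
    return K / t


data_rng = np.random.default_rng(1)
X = np.vstack([data_rng.normal(0.0, 0.1, (5, 2)),
               data_rng.normal(5.0, 0.1, (5, 2))])
K_bag = bagged_kernel(X, k=2, t=10)
K_sum = rbf_kernel(X) + K_bag     # step (4), sum variant
K_prod = rbf_kernel(X) * K_bag    # step (4), product variant
```

For step (5), a combined matrix such as `K_sum` could be handed to any SVM implementation that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`; that step is left out here to keep the sketch dependency-light.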
Different outcomes are obtained from step (2) because $k$-means gives a different solution for each initialization. In the semisupervised setting, a reduced dataset is used to compute the cluster centres. The test pixels can be assigned to the nearest cluster in each of the bagged runs to compute $K_{\mathrm{bag}}(x_*, x_i)$. This way, the assignment can be done sequentially or can be parallelized, and only the cluster centres have to be maintained. Intensity correction and histogram equalization are applied to the standard traffic sign images to reduce the effect of variable lighting and illumination, and the resulting images are then used to train the SVM.
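The histogram equalization mentioned above can be illustrated on an 8-bit grey-scale image as follows. This is a generic textbook formulation based on the cumulative histogram, offered as an assumption about what such a step might look like; the intensity correction step is not covered, and the function name is hypothetical.

```python
import numpy as np


def equalize_histogram(gray):
    """Histogram-equalize an 8-bit grey-scale image via its cumulative histogram."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[np.nonzero(hist)[0][0]]   # CDF at the darkest occupied level
    total = gray.size
    if total == cdf_min:
        return gray.copy()                  # constant image: nothing to stretch
    scaled = (cdf - cdf_min) / (total - cdf_min) * 255
    lut = np.clip(np.round(scaled), 0, 255).astype(np.uint8)
    return lut[gray]                        # map every pixel through the table


dark = np.array([[50, 60], [70, 80]], dtype=np.uint8)
bright = equalize_histogram(dark)
```

The four closely spaced grey levels of the toy image are spread over the full 0-255 range, which is the contrast-normalising effect that makes training images less sensitive to lighting.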

Performance of Image Preprocessing.
To save storage capacity and reduce the computational complexity, the original images are scaled down to 250 × 250 pixels. In the proposed approach, after the image acquisition process described in Section 2, image preprocessing is performed by the RGB segmentation approach. A threshold filter is applied to each channel to select just those regions of the image where the pixel values are in the range of the target object. The region of interest (ROI) is the logical sum of the three filtered channels of R, G, and B, as shown in Figure 4. The median filter is applied for image smoothing and filling the smaller regions of the image, as shown in Figure 4(f).

Performance of Traffic Sign Detection.
The final candidates, selected on the basis of pixel value range, area, and shape, are drawn on the image using the extracted data (centre and area) of each. In the proposed method, only traffic signs containing red are considered. After applying the shape matching technique to images containing no traffic signs, the output "no road sign is detected" is given. The results are classified into four categories. A false positive (FP) is a nonsign region incorrectly detected as a sign. A false negative (FN) is a sign detected as a nonsign region. A true positive (TP) is a sign that is correctly detected, and a true negative (TN) is a nonsign region correctly recognised as a nonsign region. The contingency matrix of the detection performance is given in Table 2.
From Table 4, the sensitivity and specificity values are calculated. Sensitivity is defined as the ability to identify a condition correctly whereas specificity is defined as the ability to exclude a condition correctly: (i) Sensitivity = TP/(TP + FN); (ii) Specificity = TN/(TN + FP) = 100%.
The tests showed that several problems affected the detection performance. Variable lighting conditions, occlusion, and illumination of traffic signs are the main reasons for false detection. The outcome of the proposed detection method shows that the red colour of the traffic sign is segmented when the sign is directly illuminated by the sun. This happens because colour segmentation with the RGB model works by comparing RGB values. In the developed system, the computational time is around 0.25 s and the accuracy rate is 94.85%. Figure 4 shows the detection steps of the traffic sign detection system. The results for our detected traffic signs are given in Figures 5 and 6. In Figure 5, the first and second columns show the true positives and true negatives, respectively. The third column shows the false negatives.

Performance of Recognition.
In the proposed system, after the colour segmentation and shape matching, the semisupervised SVM is applied and the total image is clustered for building the bagged kernel. After that, the base kernel is modified. A number of SVMs are trained separately using a bootstrap algorithm and then they are aggregated via a suitable combination technique. Intensity correction and histogram equalization are applied to the standard traffic sign images to reduce the effect of variable lighting and illumination, and the resulting images are then used to train the SVM. A total of 350 images are used, containing signs of two different shapes, circular and octagonal. Table 3 shows the database used to train the NN. Table 4 shows the final recognition results after the colour segmentation and shape matching techniques are applied. Among the 123 traffic signs, 79 signs are octagonal and are considered as G1, and 44 are circular and are considered as G2. Among the 44 circular traffic signs, there are two different classes, "no parking" and "do not enter" signs.
According to this data, the evaluation parameters are sensitivity or recall, specificity, precision or positive predictive value (PPV), false positive rate (FPR), and accuracy rate (AR), computed from the numbers of FP, FN, TP, and TN as follows: Sensitivity = TP/(TP + FN), Specificity = TN/(TN + FP), Precision = TP/(TP + FP), FPR = FP/(FP + TN), and AR = (TP + TN)/(TP + TN + FP + FN). The overall accuracy of the traffic sign recognition is 95.71% whereas the accuracy of the detection phase is 94.85%. According to the data analysis, the TPR is 89.43% whereas the FPR is 0.009. The octagonal or "BERHENTI" sign has the highest recognition rate of 94.94% and the "no parking" sign has the lowest rate of 84.09%. The processing time of the recognition system is 0.18 s. The overall processing time of the TSDR system is 0.43 s. To evaluate the system performance, the ROC curve and the area under the curve are shown in Figure 7.
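The evaluation parameters listed above follow the standard confusion-matrix definitions and can be computed as in this short sketch. The counts in the usage line are illustrative placeholders, not values taken from the paper's tables; they were picked only so that the resulting rates land near those quoted in the text.

```python
def detection_metrics(tp, fp, fn, tn):
    """Standard rates derived from the four confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),            # recall, true positive rate
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),              # positive predictive value
        "fpr": fp / (fp + tn),                    # false positive rate
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical counts for illustration only.
m = detection_metrics(tp=110, fp=2, fn=13, tn=225)
```

Note that FPR and specificity always sum to 1, so a false positive rate of 0.009 corresponds to a specificity of about 99.1%.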

Performance Comparison of SVM Based Recognition System.
A comparison of previous studies on traffic sign detection is given in Table 5. From Table 5, it can be observed that the SVM used in [44] has the highest recall rate with a good overall accuracy of over 90%; 327 signs out of 340 are correctly classified. The MSER and HOG based SVM used in [30] had the highest overall accuracy of 97.6% with a false positive rate of 0.85, and 92 signs out of 104 are classified correctly. The proposed system achieved a false positive rate of 0.009 and an accuracy of 95.71%. The precision is 98.21% and the recall is 89.43%. 112 signs among the 123 detected signs are classified correctly. The proposed method has the highest precision rate (98.21%) and the lowest FPR (0.009). The accuracy of the proposed method is 95.71%, which is good compared to other systems.
The main limitation of the developed system is that it is only applicable to red traffic signs. However, the "warning" and "prohibitory" signs, which contain red, are the most important signs, as they are more closely associated with traffic accidents. The proposed method has a low detection rate because colour tends to be unreliable due to various factors such as illumination, variable lighting, blurring, and fading. That is why the recognition process is also affected in terms of the overall accuracy rate. Another limitation is the lack of images in the Malaysian traffic sign database. The overall processing time is 0.43 s, which is still on the higher side compared to [44]. Recognizing all types of signs in Malaysia, reducing the processing time, and improving the Malaysian traffic sign database are proposed as future work.

Conclusion
The goal of this research is to develop an efficient TSDR system based on a Malaysian traffic sign dataset. In the image acquisition stage, the images were captured by an onboard camera under different weather conditions, and the image preprocessing was done using RGB colour segmentation. The recognition process is done by an SVM with bagged kernel, which is used for the first time for traffic sign classification. The developed system has shown promising results: an accuracy of 95.71%, a false positive rate of 0.009, and a processing time of 0.43 s. The recognition performance is evaluated using ROC curve analysis. The simulation results are compared with those of existing methods, showing the correctness of the implementation.

Figure 1: Model for sample collections (a) used car with a camera placed on the left side of the dashboard, (b) camera setup including a laptop, and (c) on road camera range and sign detection.

Figure 2: The overall block diagram of the detection system.

Figure 3: Flow chart of parallel SVM with bagged kernel.

Figure 4: Colour processing for traffic sign detection: (a) original image, (b) R channel after threshold, (c) G channel after threshold, (d) B channel after threshold, (e) logical sum of three channels, and (f) ROI after filtering and smoothing.

Figure 5: Examples of TP in variant lighting conditions (a), (b), and (c); example of TN in variant lighting conditions (d), (e), and (f); and examples of false detection (g), (h), and (i).

Figure 6: Final detection: (a) sample traffic sign, (b) closed curve obtained by colour thresholding, (c) after filtering and smoothing the candidate, and (d) detected ROI after shape matching and candidate selection.

Table 3: Modification of traffic signs used to train NN.

Table 1: Environmental condition for image acquisition.

Table 4: Example of traffic signs used to train NN.

Table 5: Comparison between proposed method and several existing methods.