
Mathematical Problems in Engineering (MPE)
ISSN: 1563-5147, 1024-123X
Publisher: Hindawi
DOI: 10.1155/2018/5754604
Article ID: 5754604
Year: 2018

Research Article

A New Efficient Approach to Detect Skin in Color Image Using Bayesian Classifier and Connected Component Algorithm

Thao Nguyen-Trang ¹ ² (http://orcid.org/0000-0003-2635-5371)

Also listed: Alberto Olivares

¹ Division of Computational Mathematics and Engineering, Institute for Computational Science, Ton Duc Thang University, Ho Chi Minh City, Vietnam (tdt.edu.vn)

² Faculty of Mathematics and Statistics, Ton Duc Thang University, Ho Chi Minh City, Vietnam (tdt.edu.vn)

Dates as extracted: 10 02 2018; 08 07 2018; 16 07 2018; 682018

This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Skin detection is an interesting problem in image processing and an important preprocessing step for further techniques such as face detection and objectionable image detection. However, its performance remains limited because of the high degree of overlap between “skin” and “nonskin” pixels. This paper proposes a new approach to improve skin detection performance using the Bayesian classifier and the connected component algorithm. Specifically, the Bayesian classifier is used to identify “true skin” pixels with a first posterior probability threshold, which is approximately 1, and to identify “skin candidate” pixels with a second posterior probability threshold. Subsequently, the connected component algorithm is used to find all connected components of “skin candidate” pixels. Because a skin pixel usually connects with other skin pixels in an image, all pixels in a connected component are classified as “skin” if that component contains at least one “true skin” pixel. Consequently, “nonskin” pixels whose color is similar to skin are classified as “nonskin” when their posterior probabilities are lower than the first threshold and they do not connect with any “true skin” pixel. This idea improves the skin classification performance, especially the false positive rate.

1. Introduction

Skin detection indicates the presence of human skin in a digital image by converting the original image to a binary image in which “1” represents a “skin” pixel and “0” represents a “nonskin” pixel. It is an interesting problem as well as an important preprocessing step for further techniques such as face detection, hand gesture detection, and semantic filtering of web content [1–4].

So far, two major groups of methods have been developed for solving this problem, using either color or texture features [5]. Color-based skin detection has been studied more than texture-based skin detection, and most of the state-of-the-art skin detection algorithms are color-based [6]. The majority of color-based skin detection algorithms rest on two choices: (i) the color space and (ii) the classification method. For (i), many color spaces such as RGB, HSV, YCbCr, YIQ, and YUV [6–13] have been successfully applied to the skin detection problem. Some studies concluded that the skin detection performance can be improved by using two different color spaces together [14–16]. Following these studies, this paper applies a combined color space, RGBUV, which was shown to be effective in [14].

For (ii), to classify whether a pixel is skin or not, previous studies mostly focused on two groups of methods: thresholding and machine learning. The thresholding method defines a fixed boundary between the “skin” and the “nonskin” region. If the color of a pixel falls into the “skin” region, it is classified as “skin”; otherwise it is classified as “nonskin”. Studies that applied the thresholding method to skin detection include [14, 17–20]. In short, the thresholding method has the advantage of being simple and easy to understand; however, it relies mainly on subjective experience and performs poorly when the thresholds are incorrectly tuned [1, 21]. The machine learning method detects “skin” pixels by building a predictive model from the input data. Such models, including the Bayesian classifier, linear discriminant analysis, binary logistic regression, and the adaptive neuro-fuzzy inference system, have been successfully applied to skin detection [7, 22–26].

Among them, the Bayesian classifier is especially noteworthy, not only in skin detection but also in other disciplines, because it provides the probability that an observation belongs to a class and thereby allows the reliability of the result to be evaluated [27–29]. However, the Bayesian classifier, like other methods, still suffers from low performance, especially a high false detection rate (the percentage of nonskin pixels classified as skin). The main causes of this low performance and high false detection rate are confusing backgrounds, skin-like noise pixels, and the variation of skin color with age, sex, race, and body part [14, 30]. Figure 1 shows the distribution of “skin” and “nonskin” pixels in a particular image, using the U and V color channels. In Figure 1, the green points, the red points, and the black region represent the skin pixels, the nonskin pixels, and the skin region established by the Bayesian classifier, respectively. It can be seen that the nonskin pixels are numerous and overlap with the skin pixels. The skin region built by the Bayesian classifier is not robust enough to detect all skin pixels; it also contains numerous nonskin pixels, which leads to a high false positive rate. Therefore, the skin detection problem calls for a new efficient method that can detect most of the skin while reducing the number of false positive pixels.

Figure 1: The skin and nonskin regions in a certain image.

The main contribution of this paper is a new approach to skin detection using the Bayesian classifier and the connected component algorithm. First, the Bayesian classifier is used to compute the posterior probability that a pixel belongs to the skin class. Normally, the Bayesian classifier assigns a pixel to the skin class if its posterior probability is larger than 0.5. This leads to a high false positive rate because of the high degree of overlap between the two regions, as illustrated in Figure 1. In the proposed method, a high posterior probability threshold, ε1≈1, is used so that we can identify the “true skin” pixels and decrease the false positive rate as much as possible. In addition to finding the “true skin” pixels, the Bayesian classifier also finds “skin candidate” pixels through another posterior probability threshold, ε2. Next, the connected component algorithm is used to find all connected components of “skin candidate” pixels. Based on the idea that a skin pixel is expected to connect to another skin pixel, the connected components that contain “true skin” pixels are classified as skin, and the remaining components are classified as nonskin. In other words, a skin candidate pixel is classified as skin only if it is connected with at least one “true skin” pixel. Confusing background pixels and skin-like noise pixels that do not satisfy this condition are therefore classified as nonskin, which improves the classification performance, especially in terms of the false positive rate.

The remainder of this article is organized as follows. Section 2 presents the preliminary explanations of the Bayesian classifier and connected component algorithm. The proposed method is introduced in Section 3 and illustrated and applied in Section 4. Section 5 is the conclusion.

2. Preliminary Explanations

2.1. Bayesian Classifier

We consider $k$ classes, $w_1, w_2, \dots, w_k$, with prior probabilities $q_i$, $i = 1, \dots, k$. Let $X = \{X_1, X_2, \dots, X_n\}$ be the $n$-dimensional continuous data, with $x = (x_1, x_2, \dots, x_n)$ being a specific sample. According to [31, 32], a new observation $x$ belongs to the class $w_i$ if and only if

$$P(w_i \mid x) > P(w_j \mid x) \quad \text{for } 1 \le j \le k,\ j \ne i. \tag{1}$$

In the continuous case, $P(w_i \mid x)$ is calculated by

$$P(w_i \mid x) = \frac{P(w_i)\, f(x \mid w_i)}{\sum_{j=1}^{k} P(w_j)\, f(x \mid w_j)} = \frac{q_i f_i(x)}{f(x)}. \tag{2}$$

Because $f(x)$ is the same for all classes, the classification rule is

$$q_i f_i(x) > q_j f_j(x) \iff \frac{f_i(x)}{f_j(x)} > \frac{q_j}{q_i}, \quad \forall j \ne i. \tag{3}$$

Here

$q_i$ is the prior probability of class $i$;

$f_i(x)$ is the probability density function of class $i$.

In the case of two classes, as in the skin detection problem, the new observation $x$ belongs to the class $w_1$ if and only if $q_1 f_1(x) > q_2 f_2(x)$, or equivalently $P(w_1 \mid x) > 0.5$; otherwise it belongs to $w_2$.
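As an illustration of the posterior formula and the decision rule in this subsection, the following minimal Python sketch computes the posterior probabilities from given priors and class densities. The paper does not state how the densities are estimated; the univariate Gaussian densities, the function names, and the numeric values below are assumptions made only for this example.

```python
import math


def gaussian_pdf(x, mean, std):
    """Univariate normal density, used here only as an illustrative f_i."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posteriors(x, priors, densities):
    """Return the list of P(w_i | x) given priors q_i and densities f_i."""
    joint = [q * f(x) for q, f in zip(priors, densities)]
    evidence = sum(joint)  # f(x) = sum over classes of q_j * f_j(x)
    if evidence == 0.0:
        raise ValueError("x has zero density under every class")
    return [value / evidence for value in joint]


def classify(x, priors, densities):
    """Return the index of the class with the largest posterior."""
    post = posteriors(x, priors, densities)
    return max(range(len(post)), key=post.__getitem__)


# Hypothetical two-class setting with equal priors.
priors = [0.5, 0.5]
densities = [
    lambda x: gaussian_pdf(x, 0.0, 1.0),  # class 0
    lambda x: gaussian_pdf(x, 2.0, 1.0),  # class 1
]
post = posteriors(1.0, priors, densities)
```

With equal priors, the point midway between the two means receives equal posteriors, and points closer to one mean are assigned to the corresponding class.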

2.2. Connected Component Algorithm

When processing binary images, we often want to group the pixels with value 1 into maximally connected regions. These regions are called the connected components of the binary image. Mathematically, two pixels $p$ and $q$ belong to the same connected component $C$ if there is a sequence of pixels with value 1, $p_0, p_1, \dots, p_n$, in $C$ such that $p_0 \equiv p$, $p_n \equiv q$, and $p_i$ is a neighbor of $p_{i-1}$ for $i = 1, \dots, n$, where the neighbors are defined using either 4-connected or 8-connected neighborhoods, as shown in Figure 2.

Figure 2: (a) 4-connected neighborhood; (b) 8-connected neighborhood.

This paper applies the connected component algorithm of [33], which consists of two stages and scans the image from left to right and from top to bottom. In the first stage, the algorithm assigns a new label to the first pixel of each component and attempts to propagate the label of a pixel to its neighbors to the right of or below it. This process is illustrated in Figures 3(a), 3(b), and 3(c). Figure 3(a) presents the binary image under consideration. In the first row, the two pixels with value 1 are separated by three pixels with value 0. Therefore, the first pixel is assigned label 1 and the second pixel is assigned label 2 (labels are shown in red to distinguish them from pixel values). In the second row, the first pixel with value 1 is labeled 1 because it has a neighbor labeled 1. In the same manner, the second pixel with value 1 is assigned label 2. This process is repeated until the last pixel has been assigned a label. When a pixel, such as pixel A, has two neighbors with different labels, we assign it the smallest label (label 1) and mark all pixels carrying the remaining label with an “equivalent label”. At the end of stage 1, we obtain Figure 3(c). In stage 2, the pixels marked with an “equivalent label” are considered: if a pixel has any neighbor marked with an “equivalent label”, it is marked with that “equivalent label” as well, so that all pixels of one component end up with a single label. In the end, we obtain the final connected components shown in Figure 3(d). For more details of the algorithm, please refer to [33].

Figure 3: An illustration of the connected component algorithm, panels (a) to (d).
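The two-stage labeling described above can be sketched in Python as follows. This is not the implementation of [33]; it is a common two-pass variant in which label equivalences are recorded in a union-find table during the first scan and resolved in the second. The function name `label_components` and the list-of-rows image representation are assumptions made for illustration.

```python
def label_components(image, connectivity=8):
    """Label the 1-valued pixels of a binary image given as a list of rows.

    Returns (labels, count), where labels has the same shape as image,
    background pixels are 0, and components are numbered 1..count.
    """
    rows = len(image)
    cols = len(image[0]) if rows else 0
    labels = [[0] * cols for _ in range(rows)]
    parent = [0]  # parent[l] is the union-find parent of label l; 0 is background

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)  # the smallest label represents the class

    # Neighbors already visited in a left-to-right, top-to-bottom scan.
    if connectivity == 4:
        offsets = [(-1, 0), (0, -1)]
    elif connectivity == 8:
        offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1)]
    else:
        raise ValueError("connectivity must be 4 or 8")

    # Stage 1: assign provisional labels and record equivalences.
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1:
                continue
            seen = []
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and labels[rr][cc] > 0:
                    seen.append(labels[rr][cc])
            if not seen:
                parent.append(len(parent))  # create a new label
                labels[r][c] = len(parent) - 1
            else:
                smallest = min(seen)
                labels[r][c] = smallest
                for other in seen:
                    union(smallest, other)

    # Stage 2: replace each label by a consecutive number for its class.
    remap = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                root = find(labels[r][c])
                labels[r][c] = remap.setdefault(root, len(remap) + 1)
    return labels, len(remap)


# A U-shaped component (similar in spirit to Figure 3) plus an isolated pixel.
image = [
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
]
labels, count = label_components(image)
```

In this example the two arms of the U receive different provisional labels in the first scan and are merged when the bottom row joins them, leaving two components in total.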

3. The Proposed Method

3.1. Preprocessing

To build the Bayesian model and compute the posterior probability, the Skin Detection Dataset downloaded from https://archive.ics.uci.edu/ml/datasets/skin+segmentation is used as the training set. The dataset comprises 50859 skin and 194198 nonskin samples. The available features are the pixel values in the B, G, and R channels. As mentioned earlier, the RGBUV color space is used in this paper; hence, to build the training set, we have to compute the U and V values using the following formula:

$$\begin{bmatrix} Y \\ U \\ V \end{bmatrix} = \begin{bmatrix} 0.299 & 0.587 & 0.114 \\ -0.147 & -0.289 & 0.436 \\ 0.615 & -0.515 & -0.100 \end{bmatrix} \begin{bmatrix} R \\ G \\ B \end{bmatrix} + \begin{bmatrix} 0 \\ 128 \\ 128 \end{bmatrix}. \tag{4}$$
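The conversion in formula (4) can be sketched in a few lines of Python. The function names and the tuple-based representation are assumptions for illustration; the coefficients and offsets are those of formula (4), and no clamping or rounding is applied because the paper does not mention any.

```python
# Coefficients and offsets taken from formula (4).
YUV_MATRIX = (
    (0.299, 0.587, 0.114),
    (-0.147, -0.289, 0.436),
    (0.615, -0.515, -0.100),
)
YUV_OFFSET = (0.0, 128.0, 128.0)


def rgb_to_yuv(r, g, b):
    """Apply formula (4) to one pixel and return (Y, U, V)."""
    rgb = (r, g, b)
    return tuple(
        sum(coef * value for coef, value in zip(row, rgb)) + offset
        for row, offset in zip(YUV_MATRIX, YUV_OFFSET)
    )


def rgbuv_features(b, g, r):
    """Build the (R, G, B, U, V) vector from a sample stored in B, G, R order."""
    _, u, v = rgb_to_yuv(r, g, b)
    return (r, g, b, u, v)


white = rgb_to_yuv(255, 255, 255)
black = rgb_to_yuv(0, 0, 0)
```

For gray-level pixels (R = G = B), the U and V rows of the matrix sum to zero, so both chrominance values equal the offset 128.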

3.2. The Proposed Method

Let x be a vector containing the pixel values in the R, G, B, U, and V channels. We need to decide whether x is a skin pixel or not. For this purpose, the new method is proposed, involving the following steps.

Step 1. Compute the posterior probability that the pixel belongs to the skin class, P(skin∣x), using Bayes' theorem.

(i) If P(skin∣x) > ε1, then the pixel is labeled as “true skin”.

(ii) If P(skin∣x) > ε2, then the pixel is labeled as “skin candidate”.

Step 2. Find all connected components containing “skin candidate” pixels.

Step 3. Classify the pixels with the following rule: if a connected component contains at least one “true skin” pixel, then all pixels belonging to that component are classified as “skin”; otherwise, they are classified as “nonskin”.

In the above algorithm, to keep the false positive rate low, we choose ε1≈1. As for ε2, if ε2=0.5, the detection rate of the proposed method is at most that of the Bayesian classifier, because every pixel with a posterior probability greater than 0.5 is a skin candidate, but only the candidates connected to a “true skin” pixel are kept. Therefore, choosing a threshold ε2 slightly less than 0.5 increases the detection rate of the algorithm. The effect of the thresholds ε1 and ε2 on the classification performance is discussed in more detail in Section 4.1.
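Steps 1 to 3 can be sketched as follows, assuming that the posterior probability P(skin∣x) has already been computed for every pixel and stored in a two-dimensional list. Instead of labeling every component and then testing each one, this sketch grows regions from the “true skin” pixels through neighboring “skin candidate” pixels, which yields the same final mask. The function name `detect_skin`, the default 8-connectivity, and the toy posterior map are assumptions; the paper does not state which neighborhood it uses.

```python
from collections import deque


def detect_skin(posterior, eps1=0.997, eps2=0.475, connectivity=8):
    """Return a binary mask (1 = skin) from a 2-D list of P(skin | x) values."""
    if eps1 < eps2:
        raise ValueError("eps1 must not be smaller than eps2")
    if connectivity == 4:
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    elif connectivity == 8:
        offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0)]
    else:
        raise ValueError("connectivity must be 4 or 8")

    rows = len(posterior)
    cols = len(posterior[0]) if rows else 0
    skin = [[0] * cols for _ in range(rows)]
    queue = deque()

    # Step 1(i): "true skin" pixels are the seeds.
    for r in range(rows):
        for c in range(cols):
            if posterior[r][c] > eps1:
                skin[r][c] = 1
                queue.append((r, c))

    # Steps 2 and 3: a "skin candidate" pixel becomes skin exactly when it is
    # connected to a seed through other candidate pixels.
    while queue:
        r, c = queue.popleft()
        for dr, dc in offsets:
            rr, cc = r + dr, c + dc
            if (0 <= rr < rows and 0 <= cc < cols
                    and not skin[rr][cc] and posterior[rr][cc] > eps2):
                skin[rr][cc] = 1
                queue.append((rr, cc))
    return skin


# Hypothetical posterior map: one seed at the top left, and two candidate
# pixels on the right that are not connected to any seed.
posterior = [
    [0.999, 0.6, 0.1, 0.6],
    [0.5, 0.3, 0.1, 0.7],
    [0.1, 0.1, 0.1, 0.1],
]
mask = detect_skin(posterior)
```

In the toy map, the candidates next to the seed are kept, while the two candidates on the right are rejected even though their posteriors exceed 0.5, which is the intended reduction of false positives.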

4. Numerical Example

This section presents two examples to demonstrate the effectiveness of the proposed algorithm. Example 1 describes in detail how the new method works on a certain image file taken from the Pratheepan.FacePhoto dataset [30]. This example also presents the survey results for the threshold values ε1 and ε2. In Example 2, the output binary images and the performance of the proposed method on the whole Pratheepan.FacePhoto dataset, measured by accuracy, detection rate, and false positive rate, are presented and compared with those of other methods: the Bayesian classifier (BC), linear discriminant analysis (LDA), binary logistic regression (BLR), and the adaptive neuro-fuzzy inference system (ANFIS). The detailed results are as follows.

4.1. Example 1

To illustrate the proposed method and clarify the effect of the threshold values on classification performance, this subsection performs an experiment on a certain image downloaded from http://cs-chan.com/downloads_skin_dataset.html. We first use different thresholds ε1 to find “true skin” pixels. Figures 4(a) and 4(b) present the original image and the output binary image obtained with a posterior threshold of 0.5, respectively. This is also the posterior threshold used by the Bayesian classifier. It can be seen that this threshold detects most skin pixels within the face region but incorrectly classifies “nonskin” pixels located in the hair and background as “skin” pixels, thereby incurring a high false positive rate. As can be observed from Figure 4(c) to Figure 4(f), the false positive rate is reduced when we increase the posterior probability threshold ε1. With a threshold of 0.997, the false positive rate is very low, with few misclassified pixels occurring in the background. Although the detection rate is also reduced, since the algorithm fails to detect the skin pixels in the nose region, below the eyes, and near the brows, we accept the current result and expect that such skin pixels will be restored later by the connected component algorithm.

Figure 4: The original and the output binary images using different thresholds ε1, panels (a) to (f).

Let us now consider another illustration in which the U and V channels of the current image are extracted. Figures 5(a) and 5(b) show the skin and nonskin pixels taken from the ground truth and the “skin region” built by the Bayesian classifier with thresholds of 0.5 and 0.997, respectively. With the posterior probability threshold of 0.5, the Bayesian classifier defines a larger skin region and therefore achieves a higher detection rate, but the black circle, that is, the skin region built by the Bayesian classifier, contains many red points that are nonskin pixels, which increases the false positive rate. With the posterior probability threshold of 0.997, the black circle is smaller, but virtually all points that fall in it are skin pixels. As a result, the number of false positive pixels is reduced; we accept the skin region established with the posterior probability threshold of 0.997 and enlarge it in the next step using the connected components of skin candidate pixels.

Figure 5: The skin and nonskin pixels and the skin region built by the Bayesian classifier with different thresholds ε1, panels (a) and (b).

In the next step, the threshold ε2 is used to find the “skin candidate” pixels. For the sake of clarity, we first use a fixed threshold, ε2=0.475. Figure 6 illustrates the image after finding the “skin candidate” pixels. Note that a “true skin” pixel identified above is also a “skin candidate” pixel; hence, to distinguish the cases, a pixel that is both “skin candidate” and “true skin” is shown in white, a pixel that is only “skin candidate” is shown in gray, and a nonskin pixel is shown in black. As observed in Figure 6, the false negative pixels in the nose region, below the eyes, and near the brows, which were misclassified in the previous step, are now skin candidate pixels. It can be clearly seen that these skin candidate pixels mostly connect to “true skin” pixels; as a result, they are classified as “skin” by the connected component algorithm. In contrast, most “skin candidate” pixels in the background do not connect to any “true skin” pixel and are classified as “nonskin”.

Figure 6: The “true skin”, “skin candidate”, and nonskin pixels in the image.

The final results are presented in Figure 7. It can be seen that the proposed method, like the Bayesian classifier, detects the skin in the human face well. However, the proposed method removes most of the pixels incorrectly detected by the Bayesian classifier in the hair and background, as shown in Figure 7(b). The resulting output image reduces the false positive rate and thereby improves the accuracy.

Figure 7: The final results, panels (a) and (b).

Regarding the selection of thresholds, the effects of the thresholds on the performance, measured by the accuracy, the detection rate, and the false positive rate, are investigated on a large number of images with ground truth. The detailed results obtained for the investigated thresholds are presented in Tables 1 and 2. The results are reasonable, since lower threshold values provide a better detection rate and a worse false positive rate. Therefore, we use the accuracy to balance the detection rate and the false positive rate. On that basis, ε1=0.997 and ε2=0.475, which give the highest accuracy in Tables 1 and 2, respectively, can be considered suitable thresholds.

Table 1: The survey of threshold ε1.

Threshold ε1	Accuracy	Detection rate	False positive rate
0.9	0.8186	0.8181	0.1812
0.99	0.8186	0.8080	0.1777
0.994	0.8196	0.8080	0.1762
0.997	0.8220	0.8027	0.1711

Table 2: The survey of threshold ε2.

Threshold ε2	Accuracy	Detection rate	False positive rate
0.2	0.8215	0.8137	0.1757
0.25	0.8221	0.8088	0.1732
0.3	0.822	0.8027	0.1711
0.35	0.8222	0.7984	0.1693
0.4	0.8224	0.7941	0.1676
0.425	0.8224	0.792	0.1668
0.45	0.8225	0.7899	0.1659
0.475	0.8227	0.7878	0.1649
0.5	0.8224	0.7845	0.1641
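The three measures reported in the tables can be computed from a predicted binary mask and its ground truth as sketched below. The paper does not spell out the formulas, so the usual pixel-level definitions are assumed here: accuracy is the fraction of correctly classified pixels, the detection rate is the fraction of true skin pixels classified as skin, and the false positive rate is the fraction of nonskin pixels classified as skin. The function name `evaluate` and the toy masks are hypothetical.

```python
def evaluate(predicted, truth):
    """Compare two binary masks (lists of rows, 1 = skin) pixel by pixel."""
    tp = fp = tn = fn = 0
    for pred_row, true_row in zip(predicted, truth):
        for p, t in zip(pred_row, true_row):
            if t == 1 and p == 1:
                tp += 1
            elif t == 1:
                fn += 1
            elif p == 1:
                fp += 1
            else:
                tn += 1

    def ratio(numerator, denominator):
        # Undefined ratios (for example, no skin pixel at all) become NaN.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "detection_rate": ratio(tp, tp + fn),
        "false_positive_rate": ratio(fp, fp + tn),
    }


# Hypothetical masks for illustration only.
predicted = [[1, 1, 0, 0], [1, 0, 0, 0]]
truth = [[1, 1, 1, 0], [0, 0, 0, 0]]
scores = evaluate(predicted, truth)
```

Whether the paper pools all pixels of the dataset or averages per-image scores is not stated; the sketch evaluates a single pair of masks.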

4.2. Example 2

In this section, we examine whether the proposed method improves the classification performance. In particular, the accuracy, detection rate, and false positive rate of the proposed method on the whole Pratheepan.FacePhoto dataset are presented and compared with those of other methods: the Bayesian classifier (BC), linear discriminant analysis (LDA), binary logistic regression (BLR), and the adaptive neuro-fuzzy inference system (ANFIS). For illustration purposes, some selected original and output binary images of the compared methods are presented in Figure 8. The accuracy, detection rate, and false positive rate on the whole image dataset are summarized in Table 3.

Table 3: The performance of comparative methods.

Method	Accuracy	Detection rate	False positive rate
The proposed method	0.8227	0.7878	0.1649
BC	0.8191	0.7994	0.1739
LDA	0.7507	0.7727	0.2571
BLR	0.7710	0.6738	0.1944
ANFIS	0.7881	0.8740	0.2424

Figure 8: Some original and output binary images of comparative methods.

In terms of detection rate, the proposed method correctly detects 78.78% of the true skin pixels. It can be seen from Table 3 that the proposed method is competitive, ranking third among the five methods. The best method in terms of detection rate is ANFIS, with a detection rate over 87%. However, ANFIS and the other methods still produce many false positive pixels, whereas the proposed method removes the false positive pixels in the background, as observed in Figure 8. The proposed method therefore outperforms the others, with the lowest false positive rate and the highest accuracy, approximately 82%.

5. Conclusion

This paper has proposed a new approach to detect skin in color images using the Bayesian classifier and the connected component algorithm. Illustrative examples have also been presented in detail. The results have shown that the proposed method is competitive in terms of detection rate and outperforms the other methods in terms of false positive rate and accuracy. In the future, the proposed method can be further studied for other applications, such as face detection and objectionable image detection.

Data Availability

The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

References

[1] Bianco, Gasparini, and Schettini, “Computational strategies for skin detection,” Computational Color Imaging, pp. 199–211, 2013.

[2] Khaled, S. G. Sayed, E. S. M. Saad, and Ali, “Hand gesture recognition using modified 1$ and background subtraction algorithms,” Mathematical Problems in Engineering, pp. 1–8, 2015.

[3] S.-H. Kim, H.-S. Lee, and H.-H. Kim, “Robust extraction of face candidate through segmentation and conditional merging in skin area,” in Proceedings of the 2009 IEEE International Conference on Intelligent Computing and Intelligent Systems, ICIS 2009, pp. 547–551, China, November 2009.

[4] H.-J. Lin, S.-Y. Wang, S.-H. Yen, and Y.-T. Kao, “Face detection based on skin color segmentation and neural network,” in Proceedings of the 2005 International Conference on Neural Networks and Brain Proceedings, ICNNB'05, pp. 1144–1149, China, October 2005.

[5] Kelly, Donnellan, and Molloy, “Screening for objectionable images: A review of skin detection techniques,” in Proceedings of the 2008 International Machine Vision and Image Processing Conference, pp. 151–158, September 2008.

[6] Hassan, A. R. Hilal, and Basir, “Using GA to optimize the explicitly defined skin regions for human skincolor detection,” in Proceedings of the 30th IEEE Canadian Conference on Electrical and Computer Engineering, CCECE 2017, pp. 1–4, 2017.

[7] Binias, Frąckiewicz, Jaskot, and Palus, “Pixel classification for skin detection in color images,” Advanced Technologies in Practical Applications for National Security, vol. 106, pp. 87–99, Springer International Publishing, Cham, Switzerland, 2018.

[8] Cuevas, Zaldivar, and Rojas, Fuzzy Segmentation Applied to Face Segmentation, 2004.

[9] Kakumanu, Makrogiannis, and Bourbakis, “A survey of skin-color modeling and detection methods,” Pattern Recognition, vol. 40, no. 3, pp. 1106–1122, 2007. DOI: 10.1016/j.patcog.2006.06.010.

[10] Kovac, Peer, and Solina, “Human skin color clustering for face detection,” Computer as a Tool, pp. 144–148, 2003.

[11] C. N. R. Kumar and Bindu, “An efficient skin illumination compensation model for efficient face detection,” in Proceedings of the IECON 2006 - 32nd Annual Conference on IEEE Industrial Electronics, pp. 3444–3449, France, November 2006.

[12] Lyon and Vincent, “Interactive embedded face recognition,” Journal of Object Technology, vol. 8, pp. 1–32, 2009.

[13] Prema and Manimegalai, “Survey on skin tone detection using color spaces,” International Journal of Applied Information Systems, vol. 2, pp. 18–26.

[14] Z. H. Al-Tairi, R. W. Rahma, M. I. Saripan, and P. S. Sulaiman, “Skin segmentation using YUV and RGB color spaces,” Journal of Information Processing Systems, vol. 10, no. 2, pp. 283–299, 2014. DOI: 10.3745/JIPS.02.0002.

[15] Gomez, Sanchez, and Enrique Sucar, “On Selecting an Appropriate Colour Space for Skin Detection,” in Proceedings of the Mexican International Conference on Artificial Intelligence, pp. 69–78, Springer, Berlin, Germany, 2002. DOI: 10.1007/3-540-46016-0_8.

[16] F. H. Xiang and S. A. Suandi, “Fusion of multi color space for human skin region segmentation,” International Journal of Information and Electronics Engineering, vol. 3, pp. 172–174, 2013.

[17] K. H. B. Ghazali and Xiao, “An innovative face detection based on skin color segmentation,” International Journal of Computer Applications, vol. 34, pp. 6–10, 2011.

[18] A. S. Ghotkar and G. K. Kharate, “Hand segmentation techniques to hand gesture recognition for natural human computer interaction,” International Journal of Human Computer Interaction, vol. 3, pp. 15–25, 2012.

[19] R. M. Jusoh, Hamzah, Marhaban, and N. M. A. Alias, “Skin detection based on thresholding in RGB and hue component,” in Proceedings of the 2010 IEEE Symposium on Industrial Electronics and Applications (ISIEA), pp. 515–517, 2010.

[20] Sobottka and Pitas, “A novel method for automatic face segmentation, facial feature extraction and tracking,” Signal Processing: Image Communication, vol. 12, no. 3, pp. 263–281, 1998. DOI: 10.1016/S0923-5965(97)00042-8.

[21] Yogarajah, Condell, Curran, McKevitt, and Cheddad, “A dynamic threshold approach for skin tone detection in colour images,” International Journal of Biometrics, vol. 4, no. 1, pp. 38–55, 2012. DOI: 10.1504/IJBM.2012.044291.

[22] Friedman, Geiger, and Goldszmidt, “Bayesian network classifiers,” Machine Learning, vol. 29, no. 2-3, pp. 131–163, 1997. DOI: 10.1023/A:1007465528199.

[23] M. J. Jones and J. M. Rehg, “Statistical color models with application to skin detection,” International Journal of Computer Vision, vol. 46, no. 1, pp. 81–96, 2002. DOI: 10.1023/A:1013200319198.

[24] Osman and M. S. Hitam, “Skin colour classification using linear discriminant analysis and colour mapping co-occurrence matrix,” in Proceedings of the 2013 International Conference on Computer Applications Technology, ICCAT 2013, pp. 1–5, Tunisia, January 2013.

[25] Sebe, Cohen, T. S. Huang, and Gevers, “Skin detection: A bayesian network approach,” in Proceedings of the 17th International Conference on Pattern Recognition, ICPR 2004, pp. 903–906, August 2004.

[26] A. A. Zaidan, H. A. Karim, N. N. Ahmad, G. M. Alam, and B. B. Zaidan, “A new hybrid module for skin detector using fuzzy inference system structure and explicit rules,” International Journal of Physical Sciences, vol. 5, no. 13, pp. 2084–2097, 2010.

[27] Addesso, Capodici, D'Urso, Longo, Maltese, Montone, Restaino, and Vivone, “Enhancing TIR image resolution via bayesian smoothing for IRRISAT irrigation management project,” Remote Sensing for Agriculture, Ecosystems, and Hydrology XV, 888710, 2013.

[28] Castellaro, Rizzo, Tonietto, Veronese, F. E. Turkheimer, M. A. Chappell, and Bertoldo, “A Variational Bayesian inference method for parametric imaging of PET data,” NeuroImage, vol. 150, pp. 136–149, 2017. DOI: 10.1016/j.neuroimage.2017.02.009.

[29] Vovan, “Classifying by Bayesian Method and Some Applications,” Bayesian Inference, pp. 39–61, InTech, 2017.

[30] W. R. Tan, C. S. Chan, Yogarajah, and Condell, “A fusion approach for efficient human skin detection,” IEEE Transactions on Industrial Informatics, vol. 8, no. 1, pp. 138–147, 2012. DOI: 10.1109/TII.2011.2172451.

[31] Nguyen-Trang and Vo-Van, “A new approach for determining the prior probabilities in the classification problem by Bayesian method,” Advances in Data Analysis and Classification. ADAC, vol. 11, no. 3, pp. 629–643, 2017. DOI: 10.1007/s11634-016-0253-y.

[32] Pham-Gia, Turkkan, and Vovan, “Statistical discrimination analysis using the maximum function,” Communications in Statistics - Simulation and Computation, vol. 37, pp. 320–336, Taylor & Francis, 2008. DOI: 10.1080/03610910701790475.

[33] Shapiro and Haralick, Computer and Robot Vision, Addison-Wesley, Reading, 1992.