A Color Distance Model Based on Visual Recognition



Introduction
Nowadays, computer vision has achieved great progress and provides many useful technologies, such as image segmentation, image retrieval, object tracking, and video surveillance. In many applications, illumination change happens frequently, as shown in Figure 1 (selected from the Berkeley segmentation dataset [1] and the CUHK01 Pedestrian Dataset [2]). However, illumination change is still a difficulty that is not well resolved [2][3][4].
Generally, color distances are computed as Euclidean Distances in RGB or CIELAB [5][6][7][8]. In order to evaluate this practice, a simple test is carried out. First, we generate four chromatic color pairs and six gray colors as shown in Figure 2. Every chromatic color pair, which has h1 = h2, s1 = 0.7, v1 = 0.4, s2 = 1, and v2 = 0.7 in HSV space, is assumed to come from the same object under a moderate illumination change. Then the distances between each chromatic color and all the other thirteen colors are calculated by Euclidean Distance in RGB and in CIELAB, respectively, that is, D_RGB and D_LAB. For each chromatic color, the corresponding thirteen distances are sorted in ascending order. Figure 2 shows the front seven distances corresponding to the dark orange color and the yellow color, respectively. We observe that the distance between the colors of the yellow or orange pair is not the shortest one and is even larger than six irrelevant color distances. Therefore, we conclude that Euclidean Distance cannot work well under illumination change. In addition, the results of the proposed color distance model based on visual recognition (VR), D_VR, are given for comparison. Obviously, D_VR measures the distances between these colors effectively.
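This failure mode is easy to reproduce. The sketch below (Python, standard `colorsys` module) builds one such chromatic pair with (s1, v1) = (0.7, 0.4) and (s2, v2) = (1, 0.7) and compares RGB Euclidean distances; the specific hue 0.08 (an orange) and the gray's value are illustrative choices, not taken from the paper:

```python
import colorsys
import math

def euclid_rgb(hsv1, hsv2):
    """Euclidean distance measured in RGB between two HSV colors."""
    return math.dist(colorsys.hsv_to_rgb(*hsv1), colorsys.hsv_to_rgb(*hsv2))

# A chromatic pair from the test: same hue, (s1, v1) = (0.7, 0.4) vs (s2, v2) = (1, 0.7),
# i.e., one surface under a moderate illumination change.
orange_dark  = (0.08, 0.7, 0.4)
orange_light = (0.08, 1.0, 0.7)
gray         = (0.00, 0.0, 0.3)   # an unrelated gray of similar intensity

d_pair = euclid_rgb(orange_dark, orange_light)
d_gray = euclid_rgb(orange_dark, gray)
print(d_pair, d_gray)  # the irrelevant gray comes out *closer* than the related color
```

Here the unrelated gray scores a smaller RGB Euclidean distance than the related color of the same surface, which is exactly the ordering error observed in Figure 2.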
A few researchers have proposed different distance measures, each aiming at a specific goal. Since the nice perceptual uniformity of LAB space remains in effect only within a radius of a few just-noticeable differences, Sharma et al. [9] devised a very sophisticated metric, CIEDE2000, which provides a closer fit to perceptual judgments. Mojsilović [10] proposed a complicated distance metric for naming input colors by eye. In order to achieve shading invariance, Wesolkowski and Fieguth [11] proposed a vector angle based distance measure in RGB, which decreases the weights of shaded colors. However, the distance between two pixels becomes smaller when either one is shaded. Vertan et al. [12] studied a parallel coordinate distance in RGB for edge detection, in which the differences of the three channels are used. Lee and Plataniotis [13] combined a hue direction comparison term with a chroma comparison term in LAB for computing image difference. Their comparison terms are computed using the product and the quadratic sum of two hue or chroma values. For patch or superpixel-based methods [14, 15], the color distance between two patches can be measured by their histogram distance. Given a pixel corresponding to a bin, if the illumination changes, the pixel will usually vote for another bin. Therefore, histogram-based color distance is strongly influenced by illumination. To obtain illumination invariance, many color descriptors have been proposed, such as the opponent histograms [16], the color moment invariants [16], and the shape context based color signatures [17]. The color name method [18], which assigns a color to a language term, also displays a certain amount of photometric invariance. However, these descriptors and color names are intended for feature description or image recognition and are not suitable for measuring color distance. On the other hand, the state-of-the-art LOMO feature [15] applies the multiscale Retinex algorithm to preprocess pedestrian images before extracting color histograms. The Retinex algorithm, which performs both color constancy and dynamic range compression automatically, is useful for handling illumination variations to some extent.
Since human eyes can easily recognize similar or irrelevant colors under illumination change, a novel color distance model based on visual recognition is proposed. First, we find that various colors are distributed complexly in color spaces; thus, a single model cannot achieve a good color distance. Therefore, we propose to divide the HSV space into three less complex subspaces, for which three specific distance models need to be studied. In addition, the principles of our visual color distance are introduced, while the related works [11][12][13] were proposed without explicit principles. Then a novel hue distance is modeled and trained based on visual recognition, and the chromatic distance model is proposed in line with the introduced principles. Finally, the gray distance model and the dark distance model are studied according to the natures of their subspaces, respectively. Experimental results show that the proposed model outperforms Euclidean Distance and the related methods and achieves a good distance measure against illumination change. In addition, our D_VR obtains good performance for matching patches of pedestrian images under illumination variations.

The Proposed Method
If every color belonged to the same category, it would be an easy task to measure the similarities or distances between colors. However, colors are visually perceived as various categories. The primary color categories include white, gray, black, red, orange, yellow, green, cyan, blue, and purple. Furthermore, in the color space most colors are distributed between two or more primary colors, such as yellowish green, dark red, light cyan, and gray-blue. On the other hand, colors are represented with only three values. Therefore, we consider that various colors are distributed complexly in a three-dimensional color space. Thus, it is difficult to recognize two arbitrary colors as similar or irrelevant with linear distance models and three color channels. Consequently, Euclidean Distances in RGB or CIELAB cannot measure color distance correctly, as shown in Figure 2.
As everyone knows, human eyes can recognize similar colors and discriminate various colors under illumination change. This kind of ability can be helpful to many computer vision technologies, such as object segmentation, visual tracking, and pedestrian reidentification. We propose to study a color distance model based on visual recognition. Thus, we begin with a definition of a nonmetric visual color distance. The visual distance D(c1, c2) between two colors c1 and c2 should satisfy the following principles as well as possible: (1) If c1 and c2 look more similar than c1 and c3, D(c1, c2) should be less than D(c1, c3).
(2) If c1 and c2 look completely different, D(c1, c2) should be equal to or larger than the maximum D_max.
(3) If c1 and c2, belonging to the same object, are under a moderate or strong illumination change, D(c1, c2) should be less than D_max.
In HSV space, the hue h describes the various chromatic colors. The value v = max(r, g, b) defines the intensity of a color, and the saturation s = 1 − min(r, g, b)/v defines the chromatic degree of a color. Since the HSV space is intuitive to humans and suitable for color description, any two colors can be recognized as similar or irrelevant by using hue, saturation, and value. We therefore prefer to study our visual recognition based color distance model in the HSV space. The primary color categories can be grouped into chromatic ones and achromatic ones. If the illumination is heavily decreased, many chromatic colors may appear black to some extent. An undertint color with low s may resemble a gray or a white color.
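The saturation and value definitions above translate directly into code; a minimal sketch (the two sample RGB triples are illustrative, not from the paper):

```python
def rgb_to_sv(r, g, b):
    """Saturation and value as defined above: v = max(r,g,b), s = 1 - min(r,g,b)/v."""
    v = max(r, g, b)
    s = 0.0 if v == 0 else 1.0 - min(r, g, b) / v
    return s, v

# A dimmed chromatic color keeps a high saturation while its value drops,
# whereas an undertint color has low saturation and high value.
print(rgb_to_sv(0.4, 0.2, 0.1))    # dusky orange: high s, low v
print(rgb_to_sv(0.9, 0.85, 0.8))   # undertint: near white, low s, high v
```

This is why a heavily darkened chromatic color drifts toward the dark subspace (low v) while a washed-out one drifts toward the gray subspace (low s).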
To reduce the complexity and achieve a desirable distance model, the HSV space is simply divided into three overlapping subspaces according to the above analysis, as shown in Figure 3. For each subspace, a specific color distance model needs to be studied. The chromatic subspace is composed of general colors, undertint colors, and dusky colors. If the hue values of some colors in this subspace are close enough, they will look similar or may belong to the same object. For this subspace, the key ability of its model is to discriminate various hues.
In contrast, the gray subspace and the dark subspace focus on near-gray colors and near-dark colors, respectively. Each subspace includes a focused region and an adjacent region which overlaps the chromatic subspace. Apparently, colors in an adjacent region may look similar to colors in the corresponding focused region. The distance model of each subspace is expected to be able to compute the distance between colors from its focused region and its adjacent region.

The Chromatic Distance Model
In the chromatic subspace, every color is visually recognized by its hue, saturation, and value. Given two colors c1 and c2, the differences of their features, that is, Δh, Δs, and Δv, are natural distance measures. However, Euclidean Distance, which combines these differences, does not perform well in the above test (see Figure 2). Obviously, saturation and value are simple features for distinguishing white, gray, or dark colors and colors with different saturations. Hue is the key feature for recognizing various chromatic colors. We notice that the maximum of Δh = min(|h1 − h2|, 1 − |h1 − h2|) is 0.5, while there are seven primary hues and many mixed hues. Consequently, Δh alone cannot effectively tell whether two hues are similar or different.
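The circular hue difference above can be sketched as follows:

```python
def hue_diff(h1, h2):
    """Circular hue difference on [0, 1): min(|h1 - h2|, 1 - |h1 - h2|)."""
    d = abs(h1 - h2)
    return min(d, 1.0 - d)

print(hue_diff(0.05, 0.95))  # two reddish hues across the wrap-around: 0.1
print(hue_diff(0.0, 0.5))    # the maximum possible difference: 0.5
```

Because the whole hue circle is compressed into [0, 0.5], a raw Δh of, say, 0.1 can separate two clearly different primary hues or two shades of the same hue, which is why a learned nonlinear mapping of Δh is needed.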
As humans can recognize various hues, we propose to train a novel hue distance D_VR(Δh) based on visual recognition, which is a regression task. First, D_max is set to 1, because we consider that two colors with Δs = 1 or Δv = 1 should be recognized as completely different. About seven hundred hue pairs with Δh < 0.15 are generated by randomly sampling the whole hue range. Since it is difficult to assign an exact distance value to a hue pair, a rough distance value d_t (i.e., 0.2, 0.4, 0.6, 0.8, 1.1, or 1.4) is given by eye for each hue pair, as shown in Figure 4. These default distance values correspond to similar hue pairs (d_t < 1, label y = 1) or irrelevant hue pairs (d_t > 1, y = −1).
To achieve good hue recognition performance, a logistic function, of the kind used to model the activation of a neuron in a neural network, is adopted to model the visual perception of hue distance. The parameter a is set to an appropriate constant because D_max = 1, and the bias term can then be determined by enforcing D_VR(0) = 0. Over the whole hue range, the primary hues are not distributed linearly with respect to human perception. Thus, a hinge loss function is utilized to learn the parameters w0 and w1 by gradient descent.
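The exact functional form did not survive in this copy of the text. The following is a minimal sketch under stated assumptions: the logistic is assumed to be D_VR(Δh) = a·σ(w0 + w1·Δh) + b with the bias b fixed by D_VR(0) = 0, the hinge loss penalizes similar pairs (y = +1) scoring above 1 and irrelevant pairs (y = −1) scoring below 1, and the toy pairs, initial weights, learning rate, and epoch count are all illustrative, not the paper's:

```python
import math

A = 2.0  # amplitude parameter a; the experiments section sets it to 2

def d_vr(dh, w0, w1):
    """Assumed logistic hue distance, shifted so that d_vr(0) = 0."""
    b = -A / (1.0 + math.exp(-w0))               # enforces d_vr(0) = 0
    return A / (1.0 + math.exp(-(w0 + w1 * dh))) + b

def hinge_loss(dh, y, w0, w1):
    """y = +1: similar pair, distance should stay below D_max = 1;
    y = -1: irrelevant pair, distance should exceed 1."""
    return max(0.0, y * (d_vr(dh, w0, w1) - 1.0))

def train(pairs, lr=1.0, epochs=5000, eps=1e-4):
    """Plain gradient descent with central-difference gradients on the mean hinge loss."""
    w0, w1 = -2.0, 20.0
    n = len(pairs)
    for _ in range(epochs):
        g0 = sum(hinge_loss(dh, y, w0 + eps, w1) - hinge_loss(dh, y, w0 - eps, w1)
                 for dh, y in pairs) / (2 * eps * n)
        g1 = sum(hinge_loss(dh, y, w0, w1 + eps) - hinge_loss(dh, y, w0, w1 - eps)
                 for dh, y in pairs) / (2 * eps * n)
        w0, w1 = w0 - lr * g0, w1 - lr * g1
    return w0, w1

# A few toy labeled pairs standing in for the ~700 hand-labeled ones.
pairs = [(0.01, 1), (0.02, 1), (0.03, 1), (0.08, -1), (0.10, -1), (0.12, -1)]
w0, w1 = train(pairs)
print(d_vr(0.0, w0, w1), d_vr(0.03, w0, w1), d_vr(0.12, w0, w1))
```

After training, the learned curve is zero at Δh = 0 and increases monotonically, pushing the labeled irrelevant pairs above the D_max = 1 boundary.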
Then the color distance D(c1, c2) can be calculated by combining D_VR(Δh), Δs, and Δv. However, this D(c1, c2) does not keep in line with the third principle, because Δv usually becomes large when the illumination change is heavy, and Δs also increases to some extent.
In fact, irrelevant colors can be discriminated by the most discrepant of D_VR(Δh) and Δs, and the color distance is improved accordingly. Consequently, the improved D_VR(c1, c2) gives an effective color distance under illumination change in the chromatic subspace.

The Gray Distance Model and the Dark Distance Model
In the other two subspaces, the achromatic categories, that is, white, gray, and black, are added. Their complexities are much larger than that of the pure chromatic subspace. Therefore, it is more difficult to measure color distances in these two subspaces. For example, in Figure 2 the distances between the dark orange and some grays are lower than the orange pair's distance.
Obviously, a gray or a white color only looks similar to colors with s < τ_s, where τ_s is a low saturation threshold. The gray color distance is then computed from Δv for all the colors in the gray subspace.
The key of this distance model is how to compute an appropriate gray weight w_g. For a chromatic color, the saturation may vary under illumination change, and the chromatic D_VR(c1, c2) remains effective, while the saturation of a gray-like color is limited to a very small and low range. Therefore, w_g is calculated from the lower saturation of the two colors,
where σ_s is the parameter that defines the range of the gray-like colors.
For a black color, the chromatic and saturation information is lost, and its h and s will be influenced by noise. Thus, Δh and Δs should be neglected when measuring the distance between a black color and other colors. In the dark subspace, the color distance is modeled with Δv, where τ_v is set to the value v of an appropriate dark color. Like the above w_g, the dark weight w_d is defined in the same manner, where σ_v is the parameter that defines the range of the dark colors.
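The weight and distance equations are missing from this copy of the text. The following hypothetical sketch is consistent with the description: each weight falls off linearly as the lower saturation (or value) of the two colors leaves the gray-like (or dark) range, and the subspace distance blends a Δv-based term with the chromatic distance. The linear fall-off and the blending rule are assumptions; the parameter values follow the experiments section:

```python
def gray_weight(s1, s2, sigma_s=0.2):
    """Hypothetical gray weight w_g: 1 when the lower saturation is 0,
    falling to 0 once it exceeds sigma_s (the gray-like range)."""
    return max(0.0, 1.0 - min(s1, s2) / sigma_s)

def dark_weight(v1, v2, sigma_v=0.15):
    """Hypothetical dark weight w_d, defined in the same manner on the lower value."""
    return max(0.0, 1.0 - min(v1, v2) / sigma_v)

def gray_distance(dv, d_chromatic, s1, s2):
    """Assumed blend: near-gray pairs are judged mostly by dv (hue and
    saturation are unreliable there), chromatic pairs by the chromatic model."""
    w = gray_weight(s1, s2)
    return w * dv + (1.0 - w) * d_chromatic

# Two near-gray colors: the weight is high, so dv dominates the distance.
print(gray_distance(0.1, 1.2, 0.02, 0.05))
```

A dark distance would be built analogously with `dark_weight`, ignoring Δh and Δs entirely once one color is black enough.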

Experimental Results
The related methods [11][12][13] were evaluated on their specific goals, such as image segmentation, edge detection, or computing image difference, rather than on color distance measurement. Obviously, it is practically impossible to assign appropriate distance values to most color pairs. Hence, we carry out two experiments by comparing the distances of the related color pairs with the distances between various irrelevant colors. In addition, we test our model by matching patches of pedestrian images. In our model, the parameter a can be in the range [1.7, 2.5], as D_max = 1 and D_VR(Δh) should not be much larger than D_max. Evidently, the other four parameters, that is, τ_s, σ_s, τ_v, and σ_v, should be in the range [0.1, 0.25]. In the following experiments, a, τ_s, σ_s, τ_v, and σ_v are simply set to 2, 0.15, 0.2, 0.15, and 0.15, respectively.

Tests for the Chromatic Distance Model.
To evaluate the performance of the chromatic distance model, enough testing colors covering fundamental illumination variations are needed. Usually, the value difference Δv of an object under illumination change may be significant, while the hue difference Δh is small and limited. In Figure 1, the Δh of the five patch pairs are 0.02, 0.02, 0.03, 0.01, and 0.01, respectively. Since the illumination change cases of these patch pairs are typical for many computer vision tasks, ten color query tests are conducted with the (s, v) couples of these patch pairs.
For a test, twelve query colors are generated by using (s_a, v_a) of a patch a and twelve hues sampled uniformly in [0, 1] at intervals of 1/12. In addition, for each query color, three related colors are generated by using three hue differences (i.e., Δh = −0.02, 0, and 0.02) and (s_b, v_b) of the related patch b, which belongs to the same object as patch a. As a result, a query color and its three related colors can be regarded as coming from the same object under illumination variations, while the other 44 colors are different and irrelevant to the query color.
The proposed model D_VR, Euclidean Distances in three color spaces (i.e., D_RGB, D_LAB, and D_HSV), and the distance measures [11][12][13] are evaluated by comparing the distances between each query color and all the other 47 colors of a test. The three related color distances between a query color and its related colors should be shorter than all the 44 irrelevant color distances between the query color and the irrelevant colors. Thus, a query error is defined as the number of irrelevant color distances shorter than a related color distance. Finally, the average query error over all the related colors is used to evaluate the performance of a test.
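The test construction and the query-error metric above can be sketched as follows for the RGB Euclidean baseline; the (s, v) couples used here match the Figure 2 setup and stand in for a real patch pair:

```python
import colorsys
import math

def euclid_rgb(c1, c2):
    """Euclidean distance in RGB between two HSV colors."""
    return math.dist(colorsys.hsv_to_rgb(*c1), colorsys.hsv_to_rgb(*c2))

# One test, built from the (s, v) couple of an illustrative patch pair.
sa, va = 0.7, 0.4    # query patch a
sb, vb = 1.0, 0.7    # related patch b (same object, different illumination)
hues = [i / 12 for i in range(12)]
queries = [(h, sa, va) for h in hues]
related = {q: [((q[0] + dh) % 1.0, sb, vb) for dh in (-0.02, 0.0, 0.02)]
           for q in queries}

# Query error: irrelevant color distances shorter than a related color distance.
errors = []
for q in queries:
    irrelevant = [c for p in queries if p != q for c in [p] + related[p]]
    for r in related[q]:
        dr = euclid_rgb(q, r)
        errors.append(sum(1 for c in irrelevant if euclid_rgb(q, c) < dr))

print(sum(errors) / len(errors))  # average query error for Euclidean RGB
```

Even in this small reconstruction, neighboring hues at the query's own (s, v) routinely score closer than the true related colors, so the RGB Euclidean baseline accumulates nonzero query errors.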
The average query errors of these ten tests are given in Table 1. Most distance measures, except the measure [11] and D_VR, perform badly to different extents. The main reason could be that these distance measures cannot deal with the complexity of their color spaces. The measure [11], based on vector angle in RGB, achieves only a few zero query errors. Because a vector angle depends on both saturation and hue, the measure [11] obtains good results only when the saturation of the query color is high enough and Δs is low enough. Obviously, our D_VR achieves zero query errors under different illumination variations. In D_VR, the distances of hue, saturation, and value are all considered properly, and the hue distance is studied in line with the principles. Therefore, the proposed D_VR implements a good distance measure under various color variations.

Tests for the Gray Distance Model.
Five color query tests are carried out with the five color pairs shown in Figure 5, which are gray or undertint color pairs. Thirty hues, ten saturations, and fifteen values are uniformly sampled in the ranges [0, 1], [0.2, 0.38], and [0.2, 0.9] at intervals of 0.033, 0.02, and 0.05, respectively. A total of 4500 chromatic colors generated from these sampled data are assumed to be dissimilar to all ten testing colors (s_i, v_i) in Figure 5.
In each color pair, the color with the smaller saturation is used as the query color. Its distances from its related color and from the 4500 chromatic colors are calculated by the proposed model D_VR, the above Euclidean Distances, and the distance measures [11][12][13], respectively. Then the number of chromatic color distances smaller than the related color distance is employed as the query error and recorded in Table 2. Since most of the chromatic colors are really irrelevant to each query color, a very small number relative to 4500 indicates the correctness of a distance measure. The results in Table 2 show that it is difficult for the Euclidean Distances to measure the color distance correctly in the gray subspace.
The Δh of the five pairs are 0.01, 0.18, 0.28, 0.11, and 0.24, respectively. As for the "Pink" and "Leg" pairs, their Δh are large, and their chromatic information, that is, min(s1, s2), is relatively nontrivial. Thus, our results for these two pairs are reasonably correct. Although there are four pairs with large Δh, the measure [11] and D_VR achieve good performance. Therefore, we consider that our D_VR can be applied to measure distance under illumination variations in the gray subspace.

Matching Patches for Pedestrian Reidentification.
As shown in Figure 6, 128 pairs of 9 × 9 patches uniform in color are cropped from 128 pairs of pedestrian images from the CUHK01 Dataset [2]. Many pairs are under strong illumination change, while more pairs are under moderate illumination variations. We give a hue label to each pair; the labels include pink, red, orange, yellow, green, cyan, and blue. Every query patch p is matched against all the other 255 patches by calculating their color distances with several color histogram distances and our D_VR, respectively. These distances are sorted in ascending order. Then a threshold is used to exclude patches dissimilar to patch p and retain patches with the same label as patch p. In each patch list in Figure 6, the query patch, highlighted by a dark yellow hoop, is shown at the left, and all the retained patches are listed after it. The related patch of a query patch is highlighted by a yellow hoop. A patch with a hue label different from the query patch's is regarded as a wrong match and highlighted by a red hoop. The recall is defined as the rate at which the related patch of each query patch p is retained. The error rate is computed over all the matching lists.
Our D_VR, LOMO [15], and other color histograms used by recent pedestrian reidentification works [19, 20] are evaluated. LOMO is one of the state-of-the-art features and is adopted by many current works [21, 22]. Although it extracts color histograms and texture features and applies a postprocess to address viewpoint changes of pedestrians, only an HSV 8 × 8 × 8 histogram is extracted as the color feature for a single patch.
In our experiments, we found that the Retinex preprocess of LOMO [15] improves the matching performance of every color histogram. Thus, the results of the histograms marked with (R) or (LOMO), for which the Retinex preprocess is applied, are given in Table 3. In other words, only HSV 8 × 8 × 8 and our D_VR are tested without the Retinex preprocess.
For a histogram, the input data is discretized into one of 8 or 512 bins. If the illumination changes, colors usually move to other bins. Thus, the related HSV (LOMO) distance of a patch pair may be very large or even equal to the maximum value 2, as shown in Figure 7. On the other hand, colors with different primary hues may vote for the same bin, which leads to matching mistakes. For example, in Figure 6, the two HSV (LOMO) lists include two red patches and three cyan ones, respectively. Although the illumination variations are handled by the Retinex preprocess, the best recall of these histograms is 66% at an error rate of 34%. Therefore, color histograms are not good at measuring color distance and are sensitive to illumination change.
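The bin-crossing effect described above is easy to demonstrate; the sketch below quantizes HSV colors into 8 × 8 × 8 bins (the sample colors are illustrative), and a small illumination-driven shift in v moves the color to a different bin:

```python
def hsv_bin(h, s, v, bins=8):
    """Index of the 8x8x8 histogram bin an HSV color votes for."""
    q = lambda x: min(int(x * bins), bins - 1)  # clamp 1.0 into the last bin
    return q(h) * bins * bins + q(s) * bins + q(v)

# The same surface under a slight illumination change: v moves from 0.49 to 0.51,
# crossing a bin boundary at 0.5, so the color votes for a different bin.
print(hsv_bin(0.30, 0.60, 0.49) == hsv_bin(0.30, 0.60, 0.51))  # False
```

A patch whose pixels straddle such a boundary splits its votes across bins, so even a modest illumination change can make two histograms of the same surface nearly disjoint.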
Obviously, our D_VR achieves desirable performance on this test. In Figure 7, the related D_VR distances, which are all lower than 0.7, indicate that the distances of the three channels are effectively combined and that D_VR keeps in line with the third principle. Since the hue labels given to the testing patch pairs are discretized outputs, a few matching mistakes are inevitable. Consequently, we consider that D_VR can be an effective color distance measure for pedestrian reidentification.

Conclusion
In this paper, a color distance model based on visual recognition is proposed. To reduce its complexity, the HSV space is divided into three subspaces. Then a novel hue distance is modeled based on visual recognition, and the chromatic distance model is studied in line with the principles. Finally, the gray distance model and the dark distance model are proposed according to the natures of their subspaces, respectively. Experimental results show that the proposed model outperforms Euclidean Distance and the related methods and achieves a good distance measure against illumination change. Therefore, the proposed model can be applied to image segmentation, color based detection, and image retrieval. In addition, our D_VR obtains good performance for matching patches of pedestrian images. As many patches or superpixels consist of two or three kinds of colors, an effective color clustering method needs to be studied in future work, so that the proposed model can be applied to pedestrian reidentification, visual tracking, and other patch or superpixel-based tasks.

Figure 2 :
Figure 2: Illustration of the front color distances. A value in blue is the distance of its corresponding color pair.

Figure 3 :
Figure 3: Illustration of the specific subspaces.

Figure 6 :
Figure 6: Two patch pairs and their matching results.

Figure 7 :
Figure 7: The distributions of the related color distances of the 128 patch pairs.
Figure 1: Illustration of objects under illumination variations. The mean HSV saturations and values of the patches are given.

Table 1 :
The testing results corresponding to Figure 1.

Table 2 :
The testing results corresponding to Figure 5.

Table 3 :
The results of the matching tests.