To achieve automatic sorting of commodity trademarks, a binocular vision system is constructed in this paper. By adjusting the camera poses, the system obtains a wider shooting perspective. To improve the sorting accuracy, a new SGH recognition method is proposed. SGH consists of the spatial color histogram, the gray-level cooccurrence matrix, and the Hu moment.
In an automatic commodity sorting system, the robot executes a recognition algorithm to judge the type of trademark from the image information acquired by the vision sensor. To improve the sorting efficiency, two aspects should be addressed: the camera sensor should be flexibly adjustable and have a large field of view, and the recognition algorithm should have high accuracy.
For the recognition process, machine vision recognition is often implemented in several steps: image acquisition, image preprocessing, image segmentation, feature extraction, and classification.
In machine vision recognition, the color feature is widely used in various algorithms. Zhou and Ruan proposed a recognition approach for gesture images that constructs and fills the color space of the hand-skin image and then performs support vector machine classification.
In this paper, we carry out the recognition on trademarks with the help of a binocular camera sensor. Trademark images have different features. Some have obvious color features, some have rich texture features, and some have significant shape features. Therefore, when designing the trademark recognition algorithm, we fuse the three types of features and use the images taken by the binocular camera sensor to achieve a higher recognition accuracy and efficiency.
To obtain a larger shooting range, we design a binocular camera sensor whose composition is shown in Figure
The binocular camera sensor.
The sensor contains two cameras, fixed on a bracket that can rotate and pitch with the movement of the tripod seat. A light source is arranged between the two cameras; whether to use it depends on the scene illumination conditions. Because of the bracket spacing, the two cameras obtain a wider view. These two characteristics make the binocular camera sensor well suited to the robot sorting process.
We set up the machine vision recognition system on the basis of the binocular camera sensor, which is shown in Figure
The machine vision recognition system.
First, the binocular camera sensor obtains a satisfactory shooting view by adjusting the poses of the two cameras. Then, the captured images are transmitted to the computer via the image capture card. Finally, the computer executes the recognition algorithm to process and analyze the images and outputs the recognition result.
Images captured by the camera often contain several kinds of feature information, so it is usually hard to guarantee robustness when a single feature is used for recognition. For that reason, we combine the color feature, the texture feature, and the shape feature in the recognition process.
The color histogram is a significant feature for indicating the color information of an image and is the most common color feature used in image retrieval. Histograms such as the color histogram, the overall histogram, the fuzzy histogram, and the accumulated histogram have been widely used in existing recognition algorithms. However, these methods focus only on the statistical information of the color and ignore the relation between the color feature and the spatial positions of the pixels. The spatial color histogram combines color and spatial information, and it is the form adopted in our recognition algorithm.
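As a minimal illustration of the idea (not the exact formulation used in the paper), a spatial color histogram can record, for each quantized color bin, both the pixel count and the mean normalized distance of that bin's pixels from the image center, so the descriptor reflects both color statistics and spatial layout:

```python
import numpy as np

def spatial_color_histogram(img, bins=8):
    """Sketch of a spatial color histogram: for each quantized RGB bin,
    store the normalized pixel count together with the mean normalized
    distance of that bin's pixels from the image center."""
    h, w = img.shape[:2]
    # quantize each channel into `bins` levels and form a single bin index
    q = (img.astype(np.int64) * bins) // 256            # values in [0, bins)
    idx = q[..., 0] * bins * bins + q[..., 1] * bins + q[..., 2]
    # normalized distance of every pixel from the image center
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    d = np.sqrt((ys - cy) ** 2 + (xs - cx) ** 2)
    d /= d.max() if d.max() > 0 else 1.0
    n = bins ** 3
    counts = np.bincount(idx.ravel(), minlength=n).astype(np.float64)
    dist_sum = np.bincount(idx.ravel(), weights=d.ravel(), minlength=n)
    mean_dist = np.divide(dist_sum, counts, out=np.zeros(n), where=counts > 0)
    return counts / counts.sum(), mean_dist
```

Two images with identical color statistics but different layouts then produce different `mean_dist` vectors, which is exactly the spatial information a plain color histogram discards.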
Assume that the token variable of the image is
In (
In (
Based on
The gray-level cooccurrence matrix (GLCM) is the most common measurement for indicating the texture feature of an image. The matrix records how often pairs of gray levels cooccur at a given spatial offset within the image. Besides, the GLCM sometimes needs to be normalized into
When indicating texture features, the GLCM is usually summarized by four kinds of descriptors, which are listed as follows.
In formula (
Here,
Altogether, the texture feature can be denoted as follows:
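A common choice for the four GLCM descriptors is energy, entropy, contrast, and correlation; assuming that is the set intended here, the sketch below builds a normalized GLCM for one offset and computes those four values:

```python
import numpy as np

def glcm(gray, dx=1, dy=0, levels=8):
    """Gray-level cooccurrence matrix for offset (dx, dy), normalized so
    its entries sum to 1 (joint probability of gray-level pairs)."""
    g = (gray.astype(np.int64) * levels) // 256
    h, w = g.shape
    a = g[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    b = g[max(0, dy):h - max(0, -dy), max(0, dx):w - max(0, -dx)]
    m = np.zeros((levels, levels))
    np.add.at(m, (a.ravel(), b.ravel()), 1.0)
    return m / m.sum()

def glcm_features(p):
    """Energy, entropy, contrast, and correlation of a normalized GLCM."""
    i, j = np.indices(p.shape)
    energy = np.sum(p ** 2)
    entropy = -np.sum(p[p > 0] * np.log2(p[p > 0]))
    contrast = np.sum((i - j) ** 2 * p)
    mi, mj = np.sum(i * p), np.sum(j * p)
    si = np.sqrt(np.sum((i - mi) ** 2 * p))
    sj = np.sqrt(np.sum((j - mj) ** 2 * p))
    corr = np.sum((i - mi) * (j - mj) * p) / (si * sj) if si > 0 and sj > 0 else 1.0
    return np.array([energy, entropy, contrast, corr])
```

For a perfectly uniform patch the energy is 1 and the entropy and contrast are 0; textured patches spread probability mass off the diagonal, raising entropy and contrast.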
The shape of an image can be seen as a high-level expression of its visual content. However, the shape description may change when basic properties of the image (such as its size) vary, which hinders its use in machine vision recognition. For that reason, many feature descriptors based on shape invariance have been developed, such as the inertia moment, the interior angle, the Harris corner, and the Hu moment. In this paper, we use the Hu moment to indicate the shape feature of the image.
Commonly, the Hu moment is computed over the image region, and its calculation is expensive. To obtain the Hu moment in less time, we propose a new calculation approach based on the contour. The specific steps of the approach are as follows.
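For reference, the standard region-based computation (the baseline that the contour-based variant accelerates) derives the seven invariants from normalized central moments of a binary mask; the contour-based acceleration, which replaces these area sums with contour integrals, is not reproduced here:

```python
import numpy as np

def hu_moments(mask):
    """Region-based Hu moments of a binary mask: central moments mu_pq,
    normalized moments eta_pq, then the seven invariants phi1..phi7."""
    ys, xs = np.nonzero(mask)
    m00 = len(xs)                                # zeroth moment (area)
    x, y = xs - xs.mean(), ys - ys.mean()        # centered coordinates

    def eta(p, q):                               # normalized central moment
        return np.sum(x ** p * y ** q) / m00 ** (1 + (p + q) / 2.0)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    phi = np.empty(7)
    phi[0] = n20 + n02
    phi[1] = (n20 - n02) ** 2 + 4 * n11 ** 2
    phi[2] = (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2
    phi[3] = (n30 + n12) ** 2 + (n21 + n03) ** 2
    phi[4] = ((n30 - 3 * n12) * (n30 + n12)
              * ((n30 + n12) ** 2 - 3 * (n21 + n03) ** 2)
              + (3 * n21 - n03) * (n21 + n03)
              * (3 * (n30 + n12) ** 2 - (n21 + n03) ** 2))
    phi[5] = ((n20 - n02) * ((n30 + n12) ** 2 - (n21 + n03) ** 2)
              + 4 * n11 * (n30 + n12) * (n21 + n03))
    phi[6] = ((3 * n21 - n03) * (n30 + n12)
              * ((n30 + n12) ** 2 - 3 * (n21 + n03) ** 2)
              - (n30 - 3 * n12) * (n21 + n03)
              * (3 * (n30 + n12) ** 2 - (n21 + n03) ** 2))
    return phi
```

Because the moments are centered and normalized, translating or scaling the shape leaves the seven values unchanged, which is what makes them usable as a shape feature.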
In (
When we get
When applying the recognition system shown in Figure
When carrying out the feature comparison, we compare the color, texture, and shape features of the captured image with those in the image database and thereby obtain a similarity-judging formula, shown as follows:
In the formula,
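The fusion can be sketched as a weighted combination of per-feature similarities; the weights and the distance-to-similarity mapping below are illustrative assumptions, not the paper's exact formula:

```python
import numpy as np

def fused_similarity(f_query, f_ref, weights=(0.4, 0.3, 0.3)):
    """Hypothetical fusion rule: map each feature's L2 distance to a
    similarity in (0, 1] via 1/(1 + d), then take a weighted sum over
    the color, texture, and shape features."""
    sims = []
    for q, r in zip(f_query, f_ref):     # (color, texture, shape) vectors
        d = np.linalg.norm(np.asarray(q) - np.asarray(r))
        sims.append(1.0 / (1.0 + d))
    return float(np.dot(weights, sims))
```

The database entry with the highest fused similarity is returned as the recognition result; identical features give a similarity equal to the sum of the weights (here 1.0).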
To verify the proposed recognition algorithm and the corresponding recognition system, we carry out an experiment. For the hardware, the recognition system is based on the binocular camera sensor. The cameras used in our experiment are DH-SV2001FC/FM color industrial cameras manufactured by Daheng (Group) Co., Ltd., whose highest resolution reaches 1628 × 1236. The computer is a Dell Inspiron laptop with a dual-core 2.0 GHz CPU, 8 GB of RAM, and a 1 TB hard drive. As for the software, we implement the algorithm in C++ and select 1,000 trademark images to establish the image database.
In our experiment, we attach trademark images to a planar cardboard and position the board perpendicular to the principal optical axes of the two cameras. Images taken by the two cameras are shown in Figure
Trademark images taken by the binocular camera sensor.
Images taken by the left camera
Images taken by the right camera
As shown in Figure
From these 16 images, we extracted 3 vectors of
Features of the 16 images.
The visual recognition platform is shown in Figure
Interface of vision recognition platform.
To verify the effectiveness of the proposed multifeature fusion recognition algorithm, we compare it with recognition based only on the color feature, the texture feature, and the shape feature, respectively. The recognition accuracy is defined as follows:
In formula (
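In code, the accuracy measure reduces to a simple ratio; for instance, 14 correctly recognized images out of 16 gives 87.5%, matching the fusion result for 16 images in the table below:

```python
def recognition_accuracy(num_correct, num_total):
    """Recognition accuracy: the share of presented images that the
    algorithm recognizes correctly."""
    return num_correct / num_total
```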
Recognition accuracy results of four kinds of methods are shown in Table
The accuracies of the four recognition methods.
| Color recognition | Texture recognition | Shape recognition | Fusion recognition
---|---|---|---|---
Recognizing 1 image | 100% | 100% | 100% | 100% |
Recognizing 2 images | 100% | 100% | 100% | 100% |
Recognizing 4 images | 100% | 100% | 75% | 100% |
Recognizing 8 images | 87.5% | 87.5% | 50% | 100% |
Recognizing 10 images | 90% | 80% | 60% | 100% |
Recognizing 12 images | 83.3% | 75% | 58.3% | 91.7% |
Recognizing 14 images | 78.5% | 71.4% | 57.1% | 92.8% |
Recognizing 16 images | 81.3% | 75% | 56.3% | 87.5% |
From Table
The low accuracy of the shape-feature-based recognition arises because the shapes in the trademarks used in our experiment are simple and sometimes resemble one another. Likewise, the relatively high accuracy of the color-feature-based recognition is due to the rich color information in the trademarks, which offers more references for the similarity judgment.
As for the SGH recognition algorithm that we propose, the key reason for its higher recognition accuracy is that it utilizes the three features simultaneously, which provides more effective information for the judgment.
We also find that, for all four algorithms, the recognition accuracy decreases as the number of images increases, because the discriminability of the similarity judgment declines when more images must be distinguished.
Further, we compare the time consumption for the four recognition methods, which is shown in Figure
From the curves in Figure
Time consumptions of the 4 recognition methods.
In this work, a new vision sensor with a wider view is designed for the sorting system. Based on this sensor, we establish a machine vision recognition system and design a novel SGH recognition algorithm.
In the SGH algorithm, multiple features are utilized: the color feature, the texture feature, and the shape feature, indicated by the spatial color histogram, the gray-level cooccurrence matrix, and the Hu moment, respectively. SGH is superior to traditional single-feature recognition algorithms, which are limited by the partial representational power of a single feature. Experimental results show that the accuracy of the proposed algorithm is clearly higher than that of recognition methods based merely on the color feature, the texture feature, or the shape feature.
With respect to time consumption, the SGH algorithm requires more time than single-feature algorithms, but this weakness is fully compensated by its superior recognition performance. Moreover, as the number of images increases, the time consumption of SGH shows an obvious declining tendency, indicating that in a large-scale system the efficiency of SGH approaches that of the traditional methods. In the future, we will focus on further reducing the execution time and enhancing the flexibility of our algorithm.
The authors declare that there is no conflict of interests regarding the publication of this paper.
This study was supported by the Heilongjiang Province Ordinary College Training Program for New Century Excellent Talents under Grant no. 1254-NCET-008.