Depth Measurement Based on Infrared Coded Structured Light

Depth measurement is a challenging problem in computer vision research. In this study, we first design a new grid pattern and develop a sequence coding and decoding algorithm to process the pattern. Second, we propose a linear fitting algorithm to derive the linear relationship between the object depth and the pixel shift. Third, we obtain the depth information of an object based on this linear relationship. Moreover, 3D reconstruction is implemented based on the Delaunay triangulation algorithm. Finally, we utilize the regularity of the error curves to correct the system errors and improve the measurement accuracy. The experimental results show that the accuracy of depth measurement is related to the step length of the moving object.


Introduction
Obtaining depth information from a scene is one of the most crucial problems in computer vision. Once the depth information is obtained, the resulting data can be applied in various fields, such as 3D reconstruction, remote sensing, vision measurement, and industrial automation [1][2][3]. Traditional depth measurement methods mainly include time-of-flight (TOF), stereo vision, and structured light [4].
TOF is a method with high accuracy in depth measurement. TOF cameras emit modulated near-infrared light to illuminate an object [5]. The reflected light is collected by a CCD sensor, and the depth information can be obtained by comparing the phase shift between the modulated light and its reflection. The major disadvantages of this method are the high cost and complexity of the devices.
Stereo vision simulates human vision by using two or more cameras to capture 2D images from different angles [6]. These images are called disparity images. The depth information can be calculated from the disparity images. However, this method involves complicated calculations, which result in slow measurement speed.
The structured light method has gained increasing attention due to its high speed and depth measurement accuracy [7]. Compared with the stereo vision method, the structured light method replaces one of the cameras with a projector. The projector emits a light pattern, such as a single-point pattern, a single-line pattern, or a coded pattern, onto the objects. The pattern reflection is captured by the camera, and the depth information can be calculated based on triangulation.
Visible light interference and projector calibration are two key issues in the structured light method. The projectors are difficult to calibrate because they cannot capture images actively [8]. Some studies considered the projector as an inverse camera [9], while others focused on estimating the equations of the light-stripe planes emitted by the projector [10]. However, both approaches depend on determining the camera parameters. Therefore, the main drawback of the structured light method is the coupling of errors in projector and camera calibration.
To address these problems, we first use infrared structured light as the active light source because the wavelength of infrared light differs from that of visible light. Second, we design a simple and efficient structured light pattern as well as the corresponding coding and decoding algorithm. Third, we treat the camera and laser emitter as a single unit, and we use a linear fitting algorithm to determine the system parameters rather than calibrating the camera and the projector separately.

Related Work
According to the type of projected light pattern, structured light methods can be classified into single-point pattern, single-line pattern, and coded structured light methods. The single-point pattern method obtains depth information by point-scanning the entire image. Thus, the computational complexity increases dramatically with the size of the measured object. This method demonstrates the practicability of depth measurement based on structured light.
Compared with the single-point pattern method, the single-line pattern method can obtain the depth information of an entire line of the object in one scan [11]. Therefore, the processing time and computational complexity are reduced significantly. However, the single-line method still requires scanning the object many times to obtain the complete depth information.
The coded pattern structured light method has been proposed to reduce measurement time. This method can obtain the depth data in one shot. Moreover, the method has been studied extensively in recent years because of its high accuracy. Cheng et al. proposed an arrangement coding method [12] in which six colors are arranged to produce a projection pattern. However, this method projects colored light stripes onto the object, which easily blend into the scene. Thus, the method is applicable only to objects with neutral colors and has low robustness.
Kawasaki et al. proposed a grid pattern to obtain the depth information of the object [13]. The grid pattern uses distinctive colors in the horizontal and vertical directions. This method utilizes peak detection to determine the intersections, so the decoding algorithm only needs to identify the vertical and horizontal curves. However, this method is time consuming and insensitive to the texture of the object surface.
Koninckx and van Gool [14] proposed an adaptive depth measurement method based on the epipolar constraint. The pattern can be adjusted automatically according to the noise and color of the object. The disadvantage of this method is the effect of scene noise on the pattern.
Compared with the aforementioned methods, our method is as follows. First, we design a new grid pattern and propose a sequence coding and decoding algorithm for the pattern. Then, we propose a linear fitting algorithm for the system parameters to construct the linear relationship between the object depth and the pixel shift. Finally, the depth information of the object is obtained based on this linear relationship. Moreover, 3D reconstruction is conducted based on the Delaunay triangulation algorithm [15].

Principle of Imaging System
The external appearance of the camera system and its hardware structure are illustrated in Figures 1 and 2. The system consists of an infrared laser emitter, a diffraction grating, and a CCD camera. The resolution of the CCD camera is 640 × 480. The laser diode emits a single beam, which is split into multiple beams by the diffraction grating. The CCD camera captures the pattern and correlates it against a reference pattern. The reference pattern is located on a plane of known depth and is stored in the computer memory. When the pattern is projected on an object whose distance to the sensor differs from that of the reference plane, the positions of the speckles shift along the baseline between the laser and the camera. These shifts are measured for each speckle by a simple procedure, from which a depth image can be calculated by using the equations described below.
Figure 3 illustrates the relation between the distance of an object point k to the sensor relative to a reference plane and the measured disparity d. To express the 3D coordinates of the object points, we consider a depth coordinate system with its origin at the perspective center of the infrared camera. The Z-axis is orthogonal to the image plane towards the object, the X-axis is perpendicular to the Z-axis in the direction of the baseline b between the infrared camera center and the laser projector, and the Y-axis is orthogonal to X and Z, forming a right-handed coordinate system. The reference plane is perpendicular to the Z-axis, and the object plane must appear on the right side of the reference plane. We assume that k is on the reference plane whose distance to the camera is Z_0. The speckles on the object are captured by the CCD camera. The location of a speckle on the image plane shifts in the direction of the baseline once the object is moved closer to the camera. This shift is measured as the pixel offset in the image space.
Given the similarity of triangles, we have

D / b = (Z_0 − Z) / Z_0, (1)

d / f = D / Z, (2)

where Z represents the depth of the point k on the object, f is the focal length of the CCD camera, b is the length of the baseline, D is the displacement of point k in object space, and d is the pixel offset in the image space. Substituting D from (2) into (1), we have

Z = Z_0 / (1 + (Z_0 / (f b)) d). (3)

Equation (3) is the basic mathematical model for deriving depth from the pixel offset in the image space. Z_0, f, and b are the system parameters, which can be determined by the approach described in the next section.
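For illustration, the depth model in (3) and its inverse-depth form can be checked numerically with a short sketch; the parameter values used below (reference distance, focal length in pixels, baseline) are arbitrary assumptions, not calibrated values from our system.

```python
def depth_from_offset(d, Z0, f, b):
    # Equation (3): Z = Z0 / (1 + (Z0 / (f * b)) * d),
    # where d is the pixel offset, Z0 the reference-plane distance,
    # f the focal length (in pixels), and b the baseline length.
    return Z0 / (1.0 + Z0 * d / (f * b))

def inverse_depth(d, Z0, f, b):
    # The same model is linear in 1/Z:  1/Z = d / (f * b) + 1/Z0.
    return d / (f * b) + 1.0 / Z0
```

A positive pixel offset yields a depth smaller than Z_0, consistent with the speckle shifting when the object moves closer to the camera.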

Depth Measurement
The algorithm flow of depth measurement is illustrated in Figure 4. The flow includes the calculation of the pixel offset, linear fitting of the system parameters, depth measurement, and 3D reconstruction.

Calculation of Pixel Offset.
The light intensity distribution of the speckles in the raw image is uneven, and the image is dark with poor contrast. To solve these problems, we apply an image preprocessing pipeline that includes denoising, enhancement, and binarization.
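As an illustration, the preprocessing chain can be sketched as below; the linear contrast stretch, 3 × 3 mean filter, and fixed global threshold are assumed choices, since the particular denoising and enhancement operators are not specified here.

```python
import numpy as np

def preprocess(img, thresh_ratio=0.5):
    """Denoise, enhance, and binarize a raw speckle image (sketch)."""
    img = img.astype(float)
    # Enhancement: linear contrast stretch to [0, 255].
    lo, hi = img.min(), img.max()
    stretched = (img - lo) / max(hi - lo, 1e-9) * 255.0
    # Denoising: 3 x 3 mean filter (an assumed choice).
    h, w = stretched.shape
    padded = np.pad(stretched, 1, mode="edge")
    denoised = sum(padded[i:i + h, j:j + w]
                   for i in range(3) for j in range(3)) / 9.0
    # Binarization: global threshold on the enhanced image.
    return (denoised > thresh_ratio * 255.0).astype(np.uint8)
```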
Once the image binarization is completed, we obtain the centroids of the speckles as follows:

y_c = (1/N) Σ_{i=1}^{N} y_i, (4)

where y_i is the ordinate of the i-th pixel in one speckle, N is the number of pixels in the speckle, and y_c is the centroid ordinate; the centroid abscissa is computed analogously. We developed a sequence coding algorithm to find the corresponding points between the object image and the reference image. The steps of the algorithm are as follows.
Step 1. The centroids are sorted according to the values of their abscissas. Centroids that have the same abscissa are placed in the same column.

Step 2. The centroids in each column are sorted based on their ordinate values.

Step 3. The reference and object images are encoded according to the sequence from Steps 1 and 2. All the speckles are sorted from left to right and from top to bottom.

Step 4. The reference image and object image are decoded. Each pair of corresponding points between them has the same ID.

Then the pixel offset between the object and reference images can be obtained. Figure 5 shows the method of encoding.
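The four steps above can be sketched as follows; the column tolerance `col_tol` is a hypothetical parameter for grouping near-equal abscissas, which the description above treats as exactly equal.

```python
def encode(centroids, col_tol=2.0):
    """Assign sequence IDs to speckle centroids: group by abscissa (x)
    into columns (Step 1), sort each column by ordinate (y) (Step 2),
    and number left-to-right, top-to-bottom (Step 3)."""
    pts = sorted(centroids)                       # sort by x, then y
    columns, current = [], [pts[0]]
    for p in pts[1:]:
        if abs(p[0] - current[-1][0]) <= col_tol:
            current.append(p)                     # same column
        else:
            columns.append(current)
            current = [p]
    columns.append(current)
    ids, next_id = {}, 0
    for col in columns:
        for p in sorted(col, key=lambda q: q[1]):  # top-to-bottom
            ids[next_id] = p
            next_id += 1
    return ids

def pixel_offsets(ref_ids, obj_ids):
    # Step 4: corresponding points share an ID; the offset is measured
    # along the baseline (x) direction.
    return {i: obj_ids[i][0] - ref_ids[i][0]
            for i in ref_ids if i in obj_ids}
```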

Linear Fitting of System Parameters.
As mentioned, three system parameters should be determined. We convert (3) to the linear relationship

1/Z = k1 · d + k2, (5)

where k1 = 1/(f b) and k2 = 1/Z_0 are the parameters to be determined. According to the depths and pixel offsets obtained in the experiment and the weights determined by the number of measurements, the weighted least-squares fitting results for the undetermined parameters in (5) are

k1 = (Σ w_i · Σ w_i d_i (1/Z_i) − Σ w_i d_i · Σ w_i (1/Z_i)) / (Σ w_i · Σ w_i d_i^2 − (Σ w_i d_i)^2), (6)

k2 = (Σ w_i (1/Z_i) − k1 · Σ w_i d_i) / Σ w_i, (7)

where d_i and Z_i represent the pixel offset and object depth, respectively, and w_i represents the weight determined by the number of measurements. We can calculate the object depth based on (5) and then obtain the 3D coordinates of each laser speckle on the object image. The 3D scene can be reconstructed based on Delaunay triangularization [15].
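A minimal sketch of the weighted fit, using NumPy's `polyfit` with its weight argument as a stand-in for the closed-form solution (note that `polyfit` applies the weights to the residuals, so the square roots of measurement-count weights give an exact match):

```python
import numpy as np

def fit_parameters(d, Z, w):
    """Fit 1/Z = k1*d + k2 (eq. (5)) by weighted least squares.
    d: pixel offsets; Z: measured depths; w: per-point weights."""
    k1, k2 = np.polyfit(np.asarray(d, float),
                        1.0 / np.asarray(Z, float), 1,
                        w=np.asarray(w, float))
    return k1, k2

def depth(d, k1, k2):
    # Recover depth from a pixel offset via Z = 1 / (k1*d + k2).
    return 1.0 / (k1 * d + k2)
```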
Delaunay triangularization exploits the characteristics of the point cloud. First, a convex hull that includes all the discrete data points is generated. Then, the convex hull is used to generate an initial triangulation, and the remaining discrete points are inserted one by one to produce the final triangulation. The 3D reconstruction results are shown in Figure 12.
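For instance, the triangulation of the speckle centroids can be obtained with SciPy's Qhull-based `Delaunay` class, used here as a stand-in for the incremental construction described above; the five points are hypothetical image-plane centroids.

```python
import numpy as np
from scipy.spatial import Delaunay

# Image-plane (x, y) centroids of five hypothetical speckles:
# the corners of a square plus its center.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
                   [1.0, 1.0], [0.5, 0.5]])
tri = Delaunay(points)
# Each row of tri.simplices lists the vertex indices of one triangle;
# attaching the measured depth to each vertex yields the 3D mesh.
```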

Experimental Design.
Using a KT&C CCD camera and a 50 mW infrared laser emitter, we conducted experiments on 10 groups of data. First, we captured a reference image. Then, the object was moved in steps of length Δ = 1 mm, and the pixel offset was recorded after each move. The object was moved 20 times. The procedure was repeated for each of the 10 groups of data. The first five groups were used as training data in the linear fitting to determine the parameters. Three groups were used as correction data to correct the result of the training data. The last two groups were used to test the accuracy of our method.

Error Evaluation and Correction.
We calculated the average value of the first five groups of measurement data to eliminate random errors. The parameters k1 = 0.0157 and k2 = 0.4067 were calculated from the average values according to (6) and (7).
In Figure 6, the blue crosses describe the relation between d and 1/Z. The red straight line describes the standard linear relation between them. When the object depth is small, d and 1/Z have a good linear relation. As the object depth increases, the relation between these quantities gradually becomes nonlinear. This nonlinearity indicates that the magnification differs across positions of the camera lens. The object plane of the system is not perpendicular to the optical axis; therefore, the magnification changes with the object distance Z and with system errors (such as lens distortion). Equation (5) accounts only for the influence of the former on the magnification and does not consider the role of the latter. Therefore, the changes caused by system errors should be corrected as described in the following paragraphs. Equation (5) can be rewritten as

Z = 1 / (k1 · d + k2). (8)

Then we obtain the relation between Z and d; therefore, the object depth Z can be determined from d. Substituting the other three groups of data into (8) to calculate the depth, we determine the error Δz = z − z_s, where z_s is the standard depth. The errors of the correction data are illustrated in Figure 7.
Errors are classified into random and system errors. Random errors have no regularity and can be reduced by increasing the number of measurements in the experiments. In Figure 7, the error curves of the three groups have a similar shape. Therefore, a systematic error Δz exists. We employ the regularity of the error curves to correct the results. The approach is described as follows.
Step 1. The mean of the errors is calculated, and the error fitting curve is obtained.
Step 2. z, which is calculated from the fitted depth relation, is substituted into the error curve to obtain Δz.

Step 3. Δz is subtracted from z. Then, we obtain the actual depth.
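These three steps can be sketched as follows, with a linear error curve (deg=1) matching the fitting line of Figure 8; z_train and err_train stand for the measured depths of the correction groups and their deviations from the standard depths.

```python
import numpy as np

def correct_depth(z_measured, z_train, err_train, deg=1):
    """Correct the systematic error in measured depths."""
    coeffs = np.polyfit(z_train, err_train, deg)   # Step 1: error curve
    dz = np.polyval(coeffs, z_measured)            # Step 2: evaluate
    return z_measured - dz                         # Step 3: subtract
```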
The error curve obtained by linear fitting is presented in Figure 8. Figure 9 shows the errors of the results after correction by Δz. As shown in Figure 9, the errors have no regularity; thus, the system errors have been eliminated.

Accuracy Analysis.
We use the last two groups of data to test the accuracy of the system. The test result of group 1 is shown in Table 1. From Table 1, the maximum error in group 1 is 0.2177 mm and the average error is 0.1194 mm. According to our experiment, the accuracy of the system is related to the step length.
Therefore, using only the absolute error is insufficient to show the degree of accuracy of the system. We have to consider the step length of the system to illustrate the accuracy objectively. Therefore, we propose the following indicator:

η = e / Δ, (9)

where e is the absolute error; η increases as Δ decreases when e is constant. The units of e and Δ are both millimeters. In our system, η is 0.1194.
Figure 11 shows the color map of the depth image. The color represents the depth at a particular point. The change from blue to red represents increasing object depth. Based on these data, the 3D scene can be reconstructed. The results of the 3D reconstruction are illustrated in Figure 12. The reconstructed 3D scene is viewed from two directions.

Conclusion
This paper has described an effective approach for measuring the object depth of the surrounding scene in real time. We have designed a grid pattern generated by laser diffraction. This grid pattern performs better than other structured light patterns, and it can be processed easily to determine the corresponding points between the reference and object images. In place of imaging system calibration, we have proposed a linear fitting algorithm. For the generated depth point cloud, we have reconstructed the 3D scene with the Delaunay triangularization algorithm. Random errors can be reduced by increasing the number of measurements because random errors have no regularity. Moreover, we have used the regularity of the error curves to correct the system errors and improve the measurement accuracy. The experimental results show that the depth measurement accuracy is related to the step length of the moving object. The measurement error η is 0.12 under the condition of Δ = 1 mm. If the depth variation of the object is large, the measurement error increases because of the limits of the size and point spacing of the designed grid pattern. To solve these problems, we intend to develop an omnidirectional structured light system on a mobile platform in the future.

Figure 1: The laser emits a point matrix pattern projected onto the scene. The CCD camera captures the pattern and correlates it against a reference pattern.

Figure 2: The hardware structure of the system.

Figure 3: Schematic representation of the pixel offset relation. Z represents the depth of the point k on the object, f is the focal length of the CCD camera, D is the displacement of point k in object space, and d is the pixel offset in the image space.

Figure 4: Algorithm flow of depth measurement.

Figure 5: The method of encoding. All the speckles are sorted from left to right and from top to bottom. Then each speckle is given an ID.

Figure 6: The result of linear fitting. The blue crosses describe the relation between d and 1/Z. The red straight line describes the standard linear relation between them. When the object depth is small, d and 1/Z have a good linear relation; as the object depth increases, the relation between them gradually becomes nonlinear.

Figure 7: The errors of the three groups of correction data. The three error curves are similar in shape. Therefore, system errors exist in addition to random errors.

Figure 8: The fitting line of errors.


Figure 9: The errors rectified by the error curve. The curves in the figure have no regularity, and the errors are reduced.

Figure 10: The raw image and binary image. (a) Raw image of the object; (b) binary image, which has been preprocessed.

Figure 11: The color map of the depth image. The change from blue to red represents increasing object depth.

Experiment Results.
The experiment results are presented in Figures 10 to 12. The raw image is shown in Figure 10(a), and the binary image that has been preprocessed is shown in Figure 10(b). The sequence coding algorithm described in the previous section is used to obtain the pixel offset between the corresponding points of the reference and object images.

Figure 12: The result of 3D reconstruction viewed from two directions.

Table 1: The result of test group 1.