Research on Calibration Method of Binocular Vision System Based on Neural Network

In a binocular vision inspection system, the calibration of the detection equipment is the basis for ensuring subsequent detection accuracy. Current calibration methods suffer from complex calculation, low precision, and poor operability. To address these problems, this paper studies the calibration of the binocular cameras, the correction of lens distortion, and the calibration of the projector in a binocular vision system based on surface structured light. For lens distortion correction, building on an analysis of traditional correction methods, a correction method based on a radial basis function (RBF) neural network is proposed. Exploiting the excellent nonlinear mapping ability of the RBF network, distortion correction models for different lenses can be obtained quickly, overcoming the defect that traditional correction models cannot adapt to the lens type. Experimental results show that the accuracy of the method meets the requirements of system calibration.


Introduction
With the development of modern electronic technology, the application of 3D detection in the machining field has become increasingly mature. At present, common 3D detection methods are either contact or noncontact. In traditional reverse engineering, the common method of 3D object detection is the contact measurement technology represented by the coordinate measuring machine (CMM). The advantage of this method is that it is easy to operate. However, it produces large errors for soft measurement targets, and the cost of special large-scale CMMs is very high [1]. With the development of computer technology, the application of machine vision and noncontact measurement technology in mechanical manufacturing systems has gradually become a research hotspot.
Structured light detection is a representative noncontact measurement technology [2]. In the detection process, the projector projects structured light with a specific pattern onto the target surface. The stripes of the structured light shift with the depth of the target surface, producing distortion. Because the cameras on the two sides of the projector are at different positions, the distorted images they capture also differ. The distorted structured light image contains the depth information of the measured surface and the relative position of the projector and cameras. By analyzing the distortion characteristics, the target depth information, and hence the target's 3D coordinates, can be obtained. In this calculation, in order to determine the relationship between the 3D geometric position of a point on the surface of a space object and its corresponding point in the image, the geometric models of the camera and projector must be established. The parameters of these geometric models are the parameters of the camera and projector, including internal parameters, external parameters, and distortion parameters. In most cases, these parameters can only be obtained by experiment and calculation. This process of solving the parameters is called system calibration. The accuracy of system calibration directly determines the accuracy of subsequent measurement and calculation [3-5]. Therefore, it is very important to study high-precision, high-efficiency system calibration methods for 3D detection systems.

Calibration Principle of Binocular Structured Light System
For a monocular vision system with only one projector and one camera, an equivalent camera can be created by a rigid rotation and translation of the projector and camera. However, this transformation requires that the internal parameters of the projector and camera be identical, a condition that is generally difficult to meet in engineering practice. Therefore, a binocular vision system composed of one projector and two cameras is usually used for 3D reconstruction [5,6]. Generally speaking, the projector can be modeled as an inverse pinhole imager, and the camera as a linear (pinhole) camera. When the internal parameters of the two cameras are the same and their optical centers lie in the same horizontal plane, image heights coincide, so corresponding points can be found by searching for feature points at the same height. Therefore, when building a binocular vision system, two cameras of the same model can be selected and placed on a horizontal pan-tilt platform, with the projector located between them. The two cameras view the scene from two angles and simultaneously capture the pattern projected by the projector onto the 3D target. The calibration of the structured light measurement system is the process of solving the functional relationship among the 3D coordinates of a measured point in space, the image information collected by the cameras, and the structured light information. The parameters of this function include the camera parameters, the projector parameters, and the transformation between the camera coordinate system and the world coordinate system. The calibration of the binocular structured light system thus includes camera calibration, calculation of the relative position of the two cameras, camera distortion correction, projector calibration, and calculation of the relative position between the projector and cameras [7].

Principle of Camera Imaging.
There are two cameras in the binocular vision detection system to acquire the target data.
These cameras have their own positions and parameters. The final result of the reconstruction depends on the relationship between the spatial position of the target and the corresponding image points in the cameras, that is, the geometric model and parameters of each camera. Therefore, it is necessary to model the cameras and obtain the relevant parameters for 3D reconstruction of the target. Binocular vision detection calculates the camera coordinates corresponding to each point from the coordinates of that point in the distorted structured light image obtained by the camera, and then obtains the 3D world coordinates of each point on the target surface [8,9].
As shown in Figure 1, let the upper left corner of the plane image be the coordinate origin O_0, and let D(u, v) be a known point in the image, where u and v are the pixel counts in the horizontal and vertical directions, respectively. The image coordinate system (O_1 - xy) is established with its origin at pixel position (u_0, v_0). If the image coordinates corresponding to point D(u, v) are (x, y), the relationship between (u, v) and (x, y) is

\[
u = \frac{x}{dx} + u_0, \qquad v = \frac{y}{dy} + v_0, \tag{1}
\]

where dx and dy are the physical size of each pixel. Its homogeneous form is

\[
\begin{bmatrix} u \\ v \\ 1 \end{bmatrix} =
\begin{bmatrix} 1/dx & 0 & u_0 \\ 0 & 1/dy & v_0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}. \tag{2}
\]

According to the imaging principle of the pinhole camera, the relationship between the image coordinate system and the camera coordinate system is shown in Figure 2, where O_2 is the optical center of the camera and the line O_1 O_2 is the focal length f of the camera.
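As an illustration, the pixel-to-image-coordinate conversion described above can be sketched in a few lines of Python. The principal point and pixel-size values used in the comments are arbitrary assumptions for demonstration, not the calibrated values of this system:

```python
import numpy as np

def pixel_to_image(u, v, u0, v0, dx, dy):
    """Convert pixel coordinates (u, v) to image-plane coordinates (x, y).

    (u0, v0) locates the principal point O1 in pixels; dx and dy are the
    physical size of one pixel. All values are illustrative assumptions.
    """
    x = (u - u0) * dx
    y = (v - v0) * dy
    return x, y

def image_to_pixel_homogeneous(x, y, u0, v0, dx, dy):
    """Homogeneous form: [u, v, 1]^T = A @ [x, y, 1]^T."""
    A = np.array([[1.0 / dx, 0.0, u0],
                  [0.0, 1.0 / dy, v0],
                  [0.0, 0.0, 1.0]])
    return A @ np.array([x, y, 1.0])
```

The two functions are inverses of each other, which makes a convenient sanity check on the conversion.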
As can be seen from Figure 2, the camera coordinate system can be obtained by a rotation and translation of the world coordinate system. Let the rotation matrix be R and the translation vector be t. The relationship between the world coordinate system and the camera coordinate system is then

\[
\begin{bmatrix} x_c \\ y_c \\ z_c \\ 1 \end{bmatrix} =
\begin{bmatrix} R & t \\ 0^T & 1 \end{bmatrix}
\begin{bmatrix} x_w \\ y_w \\ z_w \\ 1 \end{bmatrix}. \tag{3}
\]

According to the similar-triangle principle, the relationship between the coordinates of point D(x, y) on the plane image and its corresponding point D(x_c, y_c, z_c) in the camera coordinate system is

\[
x = \frac{f x_c}{z_c}, \qquad y = \frac{f y_c}{z_c}. \tag{4}
\]

Equation (5) is the homogeneous form after rearrangement:

\[
z_c \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} =
\begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} x_c \\ y_c \\ z_c \\ 1 \end{bmatrix}. \tag{5}
\]
From equations (1), (3), and (5), the relationship between the plane image coordinates and the world coordinates is obtained:

\[
z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} =
\begin{bmatrix} f/dx & 0 & u_0 \\ 0 & f/dy & v_0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} R & t \end{bmatrix}
\begin{bmatrix} X \\ 1 \end{bmatrix}
= M_1 M_2 \begin{bmatrix} X \\ 1 \end{bmatrix}
= M \begin{bmatrix} X \\ 1 \end{bmatrix}, \tag{6}
\]

where X = [x_w, y_w, z_w]^T. M_1 depends on f, u_0, v_0, dx, and dy, is determined by the internal structure of the camera, and is called the internal parameter matrix. M_2 is determined by the orientation of the camera relative to the world coordinate system and is called the external parameter matrix. M is called the projection matrix.
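The composition of the projection matrix in equation (6) can be sketched as follows; the helper names and the sample intrinsic values in the check are illustrative assumptions, not the paper's calibrated parameters:

```python
import numpy as np

def projection_matrix(f, dx, dy, u0, v0, R, t):
    """Compose the pinhole projection matrix M = M1 @ M2.

    M1 (3x3 intrinsics) depends only on f, dx, dy, u0, v0;
    M2 (3x4 extrinsics) stacks the rotation R and translation t.
    """
    M1 = np.array([[f / dx, 0.0, u0],
                   [0.0, f / dy, v0],
                   [0.0, 0.0, 1.0]])
    M2 = np.hstack([R, np.asarray(t, dtype=float).reshape(3, 1)])
    return M1 @ M2

def project(M, Xw):
    """Project a world point [xw, yw, zw] to pixel coordinates (u, v)."""
    p = M @ np.append(Xw, 1.0)   # homogeneous image point, scaled by zc
    return p[:2] / p[2]
```

A point on the optical axis should project to the principal point (u0, v0), which gives a quick consistency check.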

Camera Calibration.
Camera calibration is the process of obtaining the internal and external parameters of the camera. For the calibration plate, the 3D coordinates (x_w, y_w, z_w) of each feature point are known, and the plane image coordinates of the feature points are also known. Therefore, as long as there are enough feature points, the matrix M can be obtained, and then M_1 and M_2 can be recovered. For each feature point on the calibration plate,

\[
z_{ci} \begin{bmatrix} u_i \\ v_i \\ 1 \end{bmatrix} = M \begin{bmatrix} x_{wi} \\ y_{wi} \\ z_{wi} \\ 1 \end{bmatrix}. \tag{7}
\]

Eliminating z_{ci} yields the following system of equations:

\[
\begin{aligned}
m_{11} x_{wi} + m_{12} y_{wi} + m_{13} z_{wi} + m_{14} - u_i \left( m_{31} x_{wi} + m_{32} y_{wi} + m_{33} z_{wi} + m_{34} \right) &= 0, \\
m_{21} x_{wi} + m_{22} y_{wi} + m_{23} z_{wi} + m_{24} - v_i \left( m_{31} x_{wi} + m_{32} y_{wi} + m_{33} z_{wi} + m_{34} \right) &= 0.
\end{aligned} \tag{8}
\]

It can be seen from equation (8) that each feature point contributes two independent equations. Therefore, the 12 unknowns of the M matrix can be obtained from the 12 equations given by 6 feature points, solved by the least squares method. The more feature points, the smaller the error. For n feature points, 2n equations are obtained as shown in the following equation:
\[
\begin{bmatrix}
x_{w1} & y_{w1} & z_{w1} & 1 & 0 & 0 & 0 & 0 & -u_1 x_{w1} & -u_1 y_{w1} & -u_1 z_{w1} & -u_1 \\
0 & 0 & 0 & 0 & x_{w1} & y_{w1} & z_{w1} & 1 & -v_1 x_{w1} & -v_1 y_{w1} & -v_1 z_{w1} & -v_1 \\
 & & & & & & \vdots & & & & & \\
x_{wn} & y_{wn} & z_{wn} & 1 & 0 & 0 & 0 & 0 & -u_n x_{wn} & -u_n y_{wn} & -u_n z_{wn} & -u_n \\
0 & 0 & 0 & 0 & x_{wn} & y_{wn} & z_{wn} & 1 & -v_n x_{wn} & -v_n y_{wn} & -v_n z_{wn} & -v_n
\end{bmatrix}
\begin{bmatrix} m_{11} \\ m_{12} \\ \vdots \\ m_{34} \end{bmatrix} = 0. \tag{9}
\]

It can be seen from equation (6) that multiplying the M matrix by any nonzero constant does not affect the relationship between [x_w, y_w, z_w] and [u, v]. Therefore, m_34 = 1 can be specified in equation (9), reducing the number of unknowns of the M matrix to 11. Let these 11 unknowns form the vector m; then equation (9) can be abbreviated to

\[
K m = u, \tag{10}
\]

where K is a 2n × 11 matrix, m is an 11-dimensional unknown vector, and u is the 2n-dimensional vector [u_1, v_1, \ldots, u_n, v_n]^T. When 2n > 11, the least squares solution of the equation is

\[
m = \left( K^T K \right)^{-1} K^T u. \tag{11}
\]

The larger the value of 2n, the smaller the error. Solving for the vector m yields 11 of the unknowns of the M matrix. The last unknown, m_34, is solved as follows. Equation (6) can be written as

\[
M = M_1 \begin{bmatrix} R & t \end{bmatrix} =
\begin{bmatrix}
\alpha f\, r_1^T + u_0 r_3^T & \alpha f\, t_x + u_0 t_z \\
\beta f\, r_2^T + v_0 r_3^T & \beta f\, t_y + v_0 t_z \\
r_3^T & t_z
\end{bmatrix}, \tag{12}
\]

where α = 1/dx, β = 1/dy, and r_i^T is the i-th row of R. Comparing the third row of equation (12) with the estimated matrix (normalized so that its last element is 1) gives m_34 m_3^T = r_3^T, where m_3^T holds the first three elements of the normalized third row. Since r_3 is the third row of a unit orthogonal matrix, |r_3| = 1. From this,

\[
m_{34} = \frac{1}{\left| m_3 \right|}. \tag{13}
\]

After all 12 unknowns of the M matrix are obtained, each element of the internal and external parameter matrices M_1 and M_2 can be recovered.
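A minimal sketch of the calibration solve described above, assuming `numpy`. It fixes m_34 = 1 and uses `numpy.linalg.lstsq` as a numerically stable stand-in for the normal-equation form m = (K^T K)^{-1} K^T u; the function name is an assumption of this sketch:

```python
import numpy as np

def calibrate_dlt(world_pts, image_pts):
    """Solve the 11 unknowns of M (with m34 fixed to 1) by least squares.

    world_pts: (n, 3) known 3-D feature points on the calibration plate.
    image_pts: (n, 2) corresponding pixel coordinates.
    Requires 2n > 11, i.e. at least 6 non-degenerate points.
    """
    n = len(world_pts)
    K = np.zeros((2 * n, 11))
    u_vec = np.zeros(2 * n)
    for i, ((xw, yw, zw), (u, v)) in enumerate(zip(world_pts, image_pts)):
        K[2 * i]     = [xw, yw, zw, 1, 0, 0, 0, 0, -u * xw, -u * yw, -u * zw]
        K[2 * i + 1] = [0, 0, 0, 0, xw, yw, zw, 1, -v * xw, -v * yw, -v * zw]
        u_vec[2 * i], u_vec[2 * i + 1] = u, v
    # Least squares solve of K m = u (stable alternative to (K^T K)^-1 K^T u).
    m, *_ = np.linalg.lstsq(K, u_vec, rcond=None)
    return np.append(m, 1.0).reshape(3, 4)   # re-attach m34 = 1
```

With noise-free synthetic correspondences generated from a matrix whose last element is already 1, the solve recovers that matrix exactly, which is a useful self-test before feeding in real detections.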

Calculation of Relative Position between Two Cameras.
In binocular vision camera calibration, in addition to calculating the internal and external parameters of each camera, the relative position between the two cameras must also be calculated. For the two cameras,

\[
X_{c1} = \begin{bmatrix} R_1 & t_1 \end{bmatrix} X_w, \qquad
X_{c2} = \begin{bmatrix} R_2 & t_2 \end{bmatrix} X_w, \tag{14}
\]

where X_w = [x_w, y_w, z_w, 1]^T. After X_w is eliminated,

\[
X_{c2} = R_2 R_1^{-1} X_{c1} + t_2 - R_2 R_1^{-1} t_1. \tag{15}
\]

Therefore, the relative position between the two cameras can be represented by R and t as follows:

\[
R = R_2 R_1^{-1} = R_2 R_1^T, \qquad t = t_2 - R_2 R_1^{-1} t_1. \tag{16}
\]
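The elimination of X_w described above amounts to two lines of linear algebra; a hedged sketch, assuming each camera's extrinsics (R_i, t_i) are already known:

```python
import numpy as np

def relative_pose(R1, t1, R2, t2):
    """Pose of camera 2 relative to camera 1 from the two extrinsics.

    Eliminating the world point from Xc1 = R1 Xw + t1 and
    Xc2 = R2 Xw + t2 gives Xc2 = R Xc1 + t with the values below.
    """
    R = R2 @ R1.T          # R1 is orthonormal, so R1^-1 = R1^T
    t = t2 - R @ t1
    return R, t
```

A direct check is that R @ (R1 @ Xw + t1) + t reproduces R2 @ Xw + t2 for any world point Xw.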

Calibration of Projector.
The projector can be regarded as a camera working in reverse [10]. Therefore, the mathematical model of the projector can be represented by the pinhole camera model of equation (6). Although the mathematical model of the projector is the same as that of the camera, the projector cannot directly measure the pixel coordinates of each feature point on its image plane. The solution, given in reference [11], is as follows: the projector projects horizontal and vertical Gray code fringes onto the calibration plate in order of successive subdivision. After the camera captures the images, the direct and indirect light components are calculated and threshold segmentation is performed. Then, the Gray code decoding algorithm is used to obtain the coordinates of each point on the image plane of the projector.

Traditional Lens Distortion Correction Method.
The ideal pinhole model is only an approximation of a real lens. Actual cameras and projectors deviate from the pinhole model because of the lens structure and the machining and assembly errors introduced in production. For ordinary lenses, and especially for wide-angle lenses, lens distortion must be considered [12]. The dominant effect on imaging is radial distortion. Let the radial distortion parameters be k_1 and k_2. Then

\[
x' = x \left( 1 + k_1 r^2 + k_2 r^4 \right), \qquad
y' = y \left( 1 + k_1 r^2 + k_2 r^4 \right), \qquad
r^2 = x^2 + y^2, \tag{17}
\]

where (x', y') is the image coordinate obtained from the pinhole camera model and (x, y) is the actual image coordinate. Equation (17) considers only radial distortion and ignores higher-order terms. Eccentric distortion and thin prism distortion should also be considered for a real lens. Because of differences in optical models and assembly errors, different lenses cannot be described by one fixed mathematical model. In reference [4], a lens distortion correction method based on a BP neural network is proposed. However, the BP network is slow to compute and easily falls into local optima, so it cannot meet the real-time and accuracy requirements of 3D detection.
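The two-parameter radial model of equation (17) can be sketched directly; the parameter values used in the check are illustrative assumptions:

```python
def radial_distortion(x, y, k1, k2):
    """Map actual image coordinates (x, y) to the ideal pinhole
    coordinates (x', y') with the two-parameter radial model of
    equation (17); higher-order terms are ignored."""
    r2 = x**2 + y**2
    factor = 1.0 + k1 * r2 + k2 * r2**2
    return x * factor, y * factor
```

Note the factor depends only on the radius r, so points on the optical axis (x = y = 0) are unchanged, which matches the radially symmetric nature of the model.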

Lens Distortion Correction Based on RBF Network.
The RBF network is an efficient feedforward neural network. The mapping between the input layer and the hidden layer is nonlinear, while the hidden layer and the output layer are connected by linear weights [13]. This structure avoids the tedious computation of the BP network: it has both good nonlinear approximation ability and fast computing speed, and it is especially suitable for nonlinear mappings from an n-dimensional space to an m-dimensional space. The RBF network structure for lens distortion correction is shown in Figure 3. The input signal (x, y) is the coordinates of the actually captured image. The output signal (x', y') is the corrected image coordinates, defined by the known positions of the feature points on the calibration board. The number of nodes in the hidden layer is the number of samples n. The input vector of the system is d = [x, y]^T, and the output vector is d' = [x', y']^T. The element w_ij is the weight between the i-th node in the hidden layer and the j-th node in the output layer. The radial basis function φ(d_i, d_p) is the Gaussian kernel

\[
\varphi \left( d_i, d_p \right) = \exp \left( - \frac{\left\| d_i - d_p \right\|^2}{2 \sigma^2} \right), \tag{18}
\]

where d_i is the i-th input vector, d_p is the center point vector, and σ is the spread constant.
The system provides n feature point samples on the calibration board. According to the network structure, the system output is

\[
y_j = \sum_{i=1}^{n} w_{ij}\, \varphi \left( d, d_i \right), \qquad j = 1, 2. \tag{19}
\]

In order to prevent each radial basis function from being too sharp or too flat, the spread constant of the radial basis function is defined as

\[
\sigma = \frac{d_{\max}}{\sqrt{2n}}, \tag{20}
\]

where d_max is the maximum distance among the samples and n is the number of samples.
Learning in the system is divided into two stages. The first stage is unsupervised learning, in which the centers and variances of the hidden layer are determined. The second stage is supervised learning, in which the weight matrix from the hidden layer to the output layer is solved. The weights can be adjusted by minimizing the mean square error; the weight update formula is

\[
w_{ij}(k+1) = w_{ij}(k) + \eta \left[ d_j' - y_j \right] \varphi \left( d, d_i \right), \tag{21}
\]

where d_j' is the j-th expected value, y_j is the corresponding network output, and η is the learning rate.
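The two-stage training described above can be sketched as follows. This is a simplified illustration, not the paper's implementation: the hidden-layer centers are taken directly as the training samples, the spread follows the rule σ = d_max/√(2n), and for brevity the linear output weights are solved in closed form rather than by iterating the LMS update; the class and method names are assumptions of this sketch.

```python
import numpy as np

class RBFCorrector:
    """Minimal RBF distortion corrector: every training sample is a
    hidden-node center, and the output weights are solved by a linear
    least squares fit (a closed-form shortcut for the LMS iteration)."""

    def fit(self, distorted, ideal):
        self.centers = np.asarray(distorted, dtype=float)      # (n, 2)
        n = len(self.centers)
        # Spread constant: sigma = d_max / sqrt(2n).
        dists = np.linalg.norm(self.centers[:, None] - self.centers[None, :], axis=2)
        self.sigma = dists.max() / np.sqrt(2 * n)
        Phi = self._kernel(self.centers)                       # (n, n)
        self.W, *_ = np.linalg.lstsq(Phi, np.asarray(ideal, dtype=float), rcond=None)
        return self

    def _kernel(self, pts):
        # Gaussian kernel of each point against every center.
        d = np.linalg.norm(np.asarray(pts, float)[:, None] - self.centers[None, :], axis=2)
        return np.exp(-d**2 / (2 * self.sigma**2))

    def predict(self, pts):
        return self._kernel(pts) @ self.W
```

Because the centers coincide with the training samples and the Gaussian kernel matrix is positive definite for distinct points, the network interpolates the training data exactly, so the training error on the calibration-board samples is essentially zero.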

Experimental Process.
In this paper, the binocular detection system shown in Figure 4 is used to verify the above algorithm. The camera resolution is 1.3 megapixels, the measurement field is 200 mm × 150 mm, and the nominal scanning accuracy is 0.01 mm. The calibration board used in the experiment is shown in Figure 5. There are 11 × 13 = 143 regularly arranged feature points on the calibration board, including 17 locating points.
In the calibration experiment, the position of the calibration plate is fixed first. The projector projects the Gray code structured light onto the calibration plate, as shown in Figure 6, and the cameras take pictures. The attitude of the calibration board is then changed and the above work repeated. The calibration board is placed in 8 different positions in the measurement space, as shown in Figure 7, so each camera obtains 8 images for system calibration. In the calibration, the edges of the image are first extracted at the pixel level to identify the mark points, fit their center points, and number the marker points according to the locating points. Then, the 3D coordinates of the locating points are reconstructed using their locations in the first two pictures. Next, the 3D coordinates of the landmarks are reconstructed with the rest of the images.
Figure 3: The RBF network structure of lens distortion correction.
Security and Communication Networks

Data Analysis.
After the 3D coordinates of the marker points are obtained, the internal and external parameters of the cameras can be calculated by the method described above, as shown in Tables 1 and 2. In the experiment, the reprojection method is used to verify the accuracy of the calibration data: according to the parameters obtained from the calibration, the locating points are reprojected onto the image plane of the camera and compared with the actual image points. The traditional method and the method proposed in this paper are each used for calibration, and the resulting errors are shown in Figure 8. On the left is the result of the traditional calibration method; on the right is the result of the proposed method.
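The reprojection check described above can be sketched as a small helper; the function name and the values used in the check are assumptions for illustration:

```python
import numpy as np

def reprojection_error(M, world_pts, image_pts):
    """Mean reprojection error in pixels: project each known 3-D
    locating point with the calibrated matrix M and compare it with
    the corresponding detected image point."""
    errs = []
    for Xw, uv in zip(world_pts, image_pts):
        p = M @ np.append(Xw, 1.0)
        errs.append(np.linalg.norm(p[:2] / p[2] - uv))
    return float(np.mean(errs))
```

A lower mean error indicates a better calibration; computing it per board pose gives the residual tables reported below.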

When the two algorithms are used, the residual data corresponding to the eight attitudes of the calibration board are shown in Tables 3 and 4.

Conclusion
In this paper, the calibration algorithms for the cameras and projector in a binocular vision structured light detection system are introduced. The actual and ideal image coordinates are taken as the input and output of the system, respectively, and an image distortion correction system based on an RBF neural network is constructed. The actual camera distortion correction is computed by exploiting the good nonlinear fitting ability of the neural network. Experimental results show that the algorithm overcomes the shortcomings of traditional methods and that the detection results meet practical needs.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest regarding the publication of this paper.