An Efficient Calibration Method for a Stereo Camera System with Heterogeneous Lenses Using an Embedded Checkerboard Pattern

We present two simple approaches to calibrate a stereo camera setup with heterogeneous lenses: a wide-angle fish-eye lens on the left side and a narrow-angle lens on the right side. Instead of using a conventional black-white checkerboard pattern, we design an embedded checkerboard pattern by combining two differently colored patterns. In both approaches, we split the captured stereo images into RGB channels and extract the R channel from the left camera images and the inverted G channel from the right camera images. In our first approach, we consider the checkerboard pattern as the world coordinate system and calculate the left and right transformation matrices with respect to it. We use these two transformation matrices to estimate the relative pose of the right camera by multiplying the inverse of the left transformation with the right one. In the second approach, we calculate a planar homography transformation to identify common object points in left-right image pairs and process them with the well-known Zhang's camera calibration method. We analyze the robustness of these two approaches by comparing reprojection errors and image rectification results. Experimental results show that the second method is more accurate than the first one.


Introduction
The process of estimating the internal and external (also known as intrinsic and extrinsic) camera parameters and the correct relative pose between the cameras in a stereo setup has been of interest in the computer vision field for many years. It is considered the first and most important step in many 2D/3D stereo vision experiments. Much related work has been introduced over the past few decades, starting in the photogrammetry community [1,2]. As mentioned in [3], these calibration methods can be divided into two broad categories: photogrammetric calibration and self-calibration. Photogrammetric calibration is performed by observing a calibration object (normally a checkerboard pattern) whose geometry in 3D space is known with high precision. In contrast, self-calibration is performed by extracting feature points and processing correspondences between captured images of a static scene. However, one of the constraints in most of these photogrammetric calibration methods is that they use cameras with common or similar fields of view (FOV). Many self-calibration methods follow the same constraint, and only a few exploit the advantages of heterogeneous setups. Moreover, extracting rich key points is challenging and can sometimes lead to erroneous approximations.
In this paper, we propose two new, yet simplified, calibration approaches for a heterogeneous camera setup. Instead of using the general black-white checkerboard pattern, we design a new color checkerboard pattern by combining two different patterns. In our first approach, we consider the checkerboard pattern as the world coordinate system and calculate the transformations of the left and right cameras with respect to it. Multiplying the inverse of the left transformation with the right transformation gives the relative pose of the right camera with respect to the left camera. In our second approach, we use a planar homography transformation method to identify common object points in stereo images. Once these common points are estimated, we apply Zhang's method [3] to calibrate the stereo camera setup.
The remainder of this paper is organized as follows. Section 2 describes some existing stereo calibration methods for heterogeneous setups. Section 3 describes the preliminaries, including the configuration of our camera setup, the method of designing the color checkerboard pattern, and the method of separating the two patterns from each other. Section 4 forms the core of this paper and briefly introduces the two stereo calibration approaches. Section 4.1 describes the mono calibration method used to undistort the input image sequences. Section 4.2 describes the matrix multiplication method, and Section 4.3 describes the planar homography transformation-based calibration method. Experiments performed to evaluate the accuracy of these two methods are summarized in Section 5. Besides comparing reprojection errors, we perform image rectifications to see how robust our proposed methods are. Finally, conclusions and further discussion are given in Section 6.

Related Work
The popularity of wide-angle lenses, such as fish-eye lenses, has started to increase in the field of stereo vision. The wider FOV of such cameras allows users to cover a broader scene area than conventional cameras. These cameras have been used intensively in many recent stereo-based experiments, and quite a number of calibration methods have been tested with them. Barreto and Daniilidis introduced a factorization approach that estimates the relative pose between a conjugated wide-angle camera setup without nonlinear minimization [4,5], using a minimum of 15 corresponding point matches. Micusik and Pajdla proposed a RANdom SAmple Consensus (RANSAC) [6] based polynomial eigenvalue method [7] to estimate the relative pose of a noncentral catadioptric camera system [8]. Lhuillier introduced a similar approach [9] in 2008. In this method, he discussed applying a central model to estimate the geometry of the camera and decoupling orientation and translation to identify the transformation relationship. Lim et al. introduced a new stereo calibration method using an antipodal epipolar constraint [10]. In addition, many optical flow estimation approaches have been adopted for pose estimation, as cited in [11]. On the other hand, planar projection (or homography) based approaches have also been studied to estimate the relative pose in a stereo camera rig. Chen et al. proposed a calibration method for a high-definition stereo camera rig that applies the idea of homography transformation [12] to a marker chessboard. In 2013, they discussed another slightly improved image undistortion and pose estimation method in their technical paper [13].
Even though these existing methods can be used to calibrate heterogeneous stereo camera setups, most of them have certain limitations and drawbacks. Most of these methods depend on geometric invariants of image features, such as projections of straight lines, or on approximations of the fundamental matrix [13]. They require proper extraction and matching of point correspondences between stereo image pairs, which can be challenging because of the differing resolutions, FOVs, and lens distortions of the cameras. In addition, these methods are limited to small displacements, since the reliability of feature point extraction decreases when there are large FOV differences between images. The method proposed by Barreto and Daniilidis is mostly algebraic, and its linear model requires a minimum of 15 point correspondences. Precise estimation of these correspondences is more ambiguous and less accurate in difficult environments. Similarly, the method proposed by Micusik and Pajdla generalizes Fitzgibbon's technique [14] and requires 9 point correspondences, whereas the method proposed by Lhuillier requires a minimum of 7 point correspondences to calculate the fundamental matrix. The method introduced by Lim et al. imposes constraints on the distribution of feature points. The planar homography method introduced by Chen et al. in their first research article [12] sometimes failed to detect chessboard corners properly. They proposed a solution for this problem in their second research article [13] by introducing the concept of a robust homography transformation, in which they primarily focused on processing mono video cameras rather than stereo systems.
We observe that the above limitations and drawbacks occur mainly because of the use of point correspondences between stereo image pairs. The two stereo calibration methods we describe in this article do not depend on these sensitive point correspondences and do not show such difficulties. Instead, we use purely mathematical approaches for pose estimation. The embedded checkerboard pattern we introduce is a suitable alternative to the traditional black-white checkerboard pattern and can be used in cases where common areas are not visible in both images (due to FOV differences).

Focal Lengths, Field-of-Views, and Wide-Angle Cameras.
Focal length is the distance from the center of the lens to the image plane, where light converges to a single point named the focal point. Figure 1 shows how two light rays converge at this point. The focal length of a camera and its FOV are inversely related: a longer focal length results in a narrower FOV, whereas a shorter focal length results in a wider FOV. This inverse relationship determines how much of the scene the light entering the camera covers, as shown graphically in Figure 2. Using a short focal length is the basic idea of wide-angle lenses [15,16].
The popularity of wide-angle lenses, such as fish-eye lenses, has started to increase because of their ability to cover wider viewing areas. The basis of these wide-angle lenses can be considered to be the Double-Gauss lens [17], which is a compound-type lens with a positive and a negative meniscus lens on the object side and the image side, respectively. In general, all these wide-angle lenses can be categorized into two main groups: short focus lenses and retrofocus lenses. Short focus lenses are generally made of multiple glass elements whose shapes are nearly symmetrical in front of and behind the diaphragm. Retrofocus lenses use an inverted telephoto configuration, in which the front element is negative.
Calibration with a checkerboard pattern and narrow-angle cameras has been used in many existing methods. To obtain more accurate calibration results, the pattern needs to be kept near the cameras. This placement can limit the number of usable poses (even though the minimum number of poses required is six, as mentioned in [3]). In some situations, capturing the full area of the checkerboard pattern fails. One possible solution to this problem is to use wide-angle lenses. In this paper, we have decided to use a single wide-angle lens along with a narrow-angle lens.
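The relationship between focal length and FOV can be illustrated with a small calculation. The sketch below assumes an ideal rectilinear (pinhole) lens, for which the angular FOV is 2·arctan(d / 2f) for a sensor of width d. The sensor width used here is a hypothetical value, not a specification of the cameras in this paper, and real fish-eye lenses deviate from this model.

```python
import math

def horizontal_fov_deg(focal_length_mm: float, sensor_width_mm: float) -> float:
    """Angular FOV (degrees) of an ideal rectilinear lens: 2 * atan(d / (2 f))."""
    return math.degrees(2.0 * math.atan(sensor_width_mm / (2.0 * focal_length_mm)))

# Hypothetical sensor width; chosen only to illustrate the trend.
SENSOR_WIDTH_MM = 8.8

for f_mm in (3.5, 8.0):
    fov = horizontal_fov_deg(f_mm, SENSOR_WIDTH_MM)
    print(f"f = {f_mm} mm -> FOV = {fov:.1f} deg")
```

Under this model the shorter focal length always gives the wider angle, which is the trend described above.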

Designing the
However, using a wide-angle lens does not guarantee that the stereo setup captures full images of the checkerboard pattern. Since we use a narrow-angle camera in our stereo setup, it is difficult to cover the full area of the checkerboard pattern at close distance, as illustrated in Figure 3. To overcome this problem, we have designed a new checkerboard pattern and used it instead of the conventional black-white pattern. This new checkerboard pattern is shown in Figure 4. It is made by combining two differently colored checkerboard patterns: a larger 7 × 10 pattern and a smaller 6 × 8 pattern. The larger pattern (from now on called the outer pattern) consists of red and blue checkers, and the smaller pattern consists of black and green checkers. The smaller pattern is embedded into the outer pattern (as in Figure 5(a)), so that the basic colors blend. The color mixing results in a secondary inner pattern with red, yellow, blue, and cyan colors inside the outer pattern. Therefore, we can treat the result as two individual checkerboard patterns instead of a single pattern. The process of designing this special checkerboard pattern is depicted in Figure 5.
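The construction described above can be sketched with NumPy. This is a minimal illustration that assumes additive blending of RGB values, a 2:1 square-size ratio between the outer and inner patterns, and a hypothetical placement of the inner pattern. The square size in pixels and the helper names `checkerboard` and `embed` are illustrative, not taken from the paper.

```python
import numpy as np

RED, BLUE, BLACK, GREEN = (255, 0, 0), (0, 0, 255), (0, 0, 0), (0, 255, 0)

def checkerboard(rows, cols, square, color_a, color_b):
    """Return a (rows*square, cols*square, 3) uint8 RGB checkerboard image."""
    r, c = np.indices((rows * square, cols * square)) // square
    mask = ((r + c) % 2 == 0)[..., None]  # True where color_a is painted
    a = np.array(color_a, dtype=np.uint8)
    b = np.array(color_b, dtype=np.uint8)
    return np.where(mask, a, b).astype(np.uint8)

def embed(outer, inner, top, left):
    """Additively blend `inner` into `outer` with its top-left corner at (top, left)."""
    out = outer.copy()
    h, w = inner.shape[:2]
    region = out[top:top + h, left:left + w].astype(np.uint16) + inner
    out[top:top + h, left:left + w] = np.clip(region, 0, 255).astype(np.uint8)
    return out

OUTER_SQ = 40             # hypothetical square size in pixels
INNER_SQ = OUTER_SQ // 2  # assumed 2:1 size ratio

outer = checkerboard(7, 10, OUTER_SQ, RED, BLUE)     # red-blue outer pattern
inner = checkerboard(6, 8, INNER_SQ, BLACK, GREEN)   # black-green inner pattern
pattern = embed(outer, inner, top=2 * OUTER_SQ, left=3 * OUTER_SQ)
```

With additive blending, red plus green gives yellow and blue plus green gives cyan, while black leaves red and blue unchanged, so the embedded region contains exactly the four colors named in the text.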

Capturing Calibration Images of Special Checkerboard Pattern.
The heterogeneous stereo camera setup we have used in our experiments is depicted in Figure 6. Two Point Grey Grasshopper cameras are mounted on either side of a horizontal panning bar: a wide-angle camera on the left side (focal length ≅ 3.5 mm) and a narrow-angle camera on the right side (focal length ≅ 8 mm). We kept the special checkerboard pattern in front of the cameras in such a way that the narrow-angle camera always sees the full area of the inner checkerboard pattern. Since the wide-angle camera has a wider FOV, it fully sees both the inner and outer patterns (Figure 7).
In our experiments, we wanted to retain only the outer pattern from the wide-angle camera images and only the inner pattern from the narrow-angle camera images. We performed RGB channel splitting to distinguish the two patterns from each other. Once the R channel is extracted, the outer pattern can be identified separately in the wide-angle camera images. Similarly, we first extracted the G channel from the narrow-angle camera images and inverted it to identify the inner pattern. Figure 8 shows an instance of how we separated the two patterns from each other. Figures 8(a) and 8(b) show the left wide-angle and right narrow-angle camera images.
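The channel splitting step can be illustrated on a tiny synthetic image. The sketch below assumes 8-bit images stored in RGB channel order (some libraries, such as OpenCV, load images in BGR order, in which case the indices change). The function names are illustrative.

```python
import numpy as np

def outer_pattern_channel(rgb):
    """R channel: red and yellow become bright, blue and cyan become dark."""
    return rgb[..., 0].copy()

def inner_pattern_channel(rgb):
    """Inverted G channel: red and blue become bright, yellow and cyan become dark."""
    return 255 - rgb[..., 1]

# One pixel of each color that appears in the embedded pattern.
tile = np.array(
    [[[255, 0, 0], [255, 255, 0]],    # red,  yellow
     [[0, 0, 255], [0, 255, 255]]],   # blue, cyan
    dtype=np.uint8,
)

print(outer_pattern_channel(tile))  # red/yellow share one value, blue/cyan the other
print(inner_pattern_channel(tile))  # red/blue share one value, yellow/cyan the other
```

In the R channel, yellow is indistinguishable from red and cyan from blue, so only the large red-blue checkers remain. In the G channel the opposite happens and only the small checkers remain.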

Stereo Calibration
4.1. Mono Camera Calibration.
One of the problems of using wide-angle cameras is that they suffer from strong barrel distortion. Performing stereo calibration without correcting this distortion could lead to erroneous matrix calculations. Consequently, we start both stereo calibration methods by first undistorting the input wide-angle and narrow-angle images.
We use the same experimental setup mentioned in Section 3.3. We kept the special checkerboard pattern at a short distance and captured the left wide-angle and right narrow-angle camera images separately. After capturing the images, we followed the method mentioned in Section 3.3 to retain the outer pattern in the wide-angle camera images and the inner pattern in the narrow-angle camera images. We then used the well-known Zhang's method [3] to calibrate the cameras independently. Figure 9 shows an instance in which the wide-angle and narrow-angle cameras are calibrated separately.

Stereo Calibration Using Transformation Matrices.
The first stereo calibration approach is based on multiplying the two transformation matrices of the wide-angle and narrow-angle cameras. Once the two cameras are properly calibrated as mentioned in Section 4.1, we capture stereo image sequences of the checkerboard pattern from the two cameras at the same time. While capturing images, we kept the checkerboard pattern at a short distance from the cameras in such a way that the wide-angle camera sees the full area of the pattern and the narrow-angle camera sees the full area of the inner pattern. In this method, we considered the checkerboard pattern as the world coordinate system, with the origin at the intersection point of the first red and blue checkers of the outer pattern. Since we treat the inner and outer patterns as two different checkerboards, the inner pattern has its own origin at the intersection point of the first red and yellow checkers, and we shifted this origin to the origin of the outer pattern by simply adding the distance between the two origins. This is graphically described in Figure 10.
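The origin shift described above can be sketched as follows. This is a minimal illustration: the square sizes, the offset between the two origins, and the helper name `grid_object_points` are hypothetical values chosen for the example. A board with 7 × 10 squares has 6 × 9 = 54 interior corners, and a board with 6 × 8 squares has 5 × 7 = 35.

```python
import numpy as np

def grid_object_points(rows, cols, square, origin=(0.0, 0.0)):
    """Planar (Z = 0) corner coordinates in row-major order, shifted by `origin`."""
    ys, xs = np.mgrid[0:rows, 0:cols].astype(float) * square
    return np.column_stack([
        xs.ravel() + origin[0],
        ys.ravel() + origin[1],
        np.zeros(rows * cols),
    ])

OUTER_SQ_MM = 40.0                 # hypothetical outer square size
INNER_SQ_MM = 20.0                 # hypothetical inner square size
INNER_ORIGIN_OFFSET = (100.0, 60.0)  # hypothetical distance between the two origins (mm)

# Outer pattern corners, with the world origin at the first outer corner.
outer_points = grid_object_points(6, 9, OUTER_SQ_MM)

# Inner pattern corners, expressed in the same world frame by adding the offset.
inner_points = grid_object_points(5, 7, INNER_SQ_MM, origin=INNER_ORIGIN_OFFSET)
```

After the shift, both point sets live in one world coordinate system, so the extrinsics estimated from either pattern refer to the same origin.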
Let $T_W$ and $T_N$ denote the transformation matrices of the wide-angle and narrow-angle cameras with respect to the world coordinate system. We wanted to find the relative pose of the narrow-angle camera with respect to the wide-angle camera, $T_{WN}$. We used the captured stereo image sequences to calibrate the two cameras separately (Section 4.1) and estimated two 3 × 4 camera matrices (or perspective projection matrices).
The general relationship between a 3D point $X_{world}$ in the world coordinate system and its respective 2D point $x_{image}$ in the image coordinate system can be written, in homogeneous coordinates, as
$$x_{image} = P X_{world}, \tag{1}$$
where $P$ depicts the camera matrix. This $P$ matrix can be further decomposed into the intrinsic camera matrix $K$ and the rigid transformation matrix (or the extrinsic matrix) $[R, t]$ [18]. Thus, (1) can be rewritten as
$$x_{image} = K [R, t] X_{world}, \tag{2}$$
where $R$ denotes the 3 × 3 rotation and $t$ denotes the 3 × 1 translation ($R_W$, $R_N$ and $t_W$, $t_N$ in (3), respectively). These intrinsic and extrinsic entries of the $P$ matrix can be easily identified using RQ factorization [19].
Figure 7: A representation of how the special checkerboard pattern is seen by the left wide-angle and right narrow-angle cameras. The checkerboard pattern is kept at a very close distance to the cameras. The wide-angle camera has a wider FOV; thus it sees the whole area of the checker pattern. The FOV of the narrow-angle camera is narrower; thus it always sees the inner pattern.
We can estimate both the $T_W$ and $T_N$ transformation matrices by applying this generalization to the wide-angle and narrow-angle cameras separately as follows:
$$x_W = K_W [R_W, t_W] X_{world}, \qquad x_N = K_N [R_N, t_N] X_{world}, \tag{3}$$
where $T_W = \begin{bmatrix} R_W & t_W \\ 0^{\top} & 1 \end{bmatrix}$ and $T_N = \begin{bmatrix} R_N & t_N \\ 0^{\top} & 1 \end{bmatrix}$. The following equation depicts the relationship between the transformation matrices shown in Figure 11, from which we are interested in estimating $T_{WN}$:
$$T_N = T_{WN} T_W. \tag{4}$$
We multiplied the inverse of the left transformation matrix with the right transformation matrix to find the relative pose of the narrow-angle camera with respect to the wide-angle camera as follows:
$$T_{WN} = T_N T_W^{-1}. \tag{5}$$
Figure 12 graphically summarizes the whole matrix multiplication-based calibration procedure as a flow chart.
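The relative pose computation can be sketched with 4 × 4 homogeneous matrices. The sketch assumes that each extrinsic matrix maps world coordinates into its camera frame, in which case the transform from the wide-angle camera frame to the narrow-angle camera frame is `T_n @ inv(T_w)`. If the matrices instead store camera-to-world poses, the product order becomes `inv(T_w) @ T_n`. All numeric values and function names below are hypothetical.

```python
import numpy as np

def to_homogeneous(R, t):
    """Stack a 3x3 rotation and a 3-vector translation into a 4x4 rigid transform."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = np.ravel(t)
    return T

def rot_y(deg):
    """Rotation about the y axis by `deg` degrees."""
    a = np.radians(deg)
    return np.array([[np.cos(a), 0.0, np.sin(a)],
                     [0.0, 1.0, 0.0],
                     [-np.sin(a), 0.0, np.cos(a)]])

def relative_pose(T_w, T_n):
    """Transform taking wide-camera coordinates to narrow-camera coordinates.

    Assumes T_w and T_n are world-to-camera extrinsics.
    """
    return T_n @ np.linalg.inv(T_w)

# Hypothetical world-to-camera extrinsics for the two cameras.
T_w = to_homogeneous(rot_y(5.0), [0.10, 0.0, 1.0])
T_n = to_homogeneous(rot_y(-3.0), [-0.20, 0.0, 1.0])
T_wn = relative_pose(T_w, T_n)
```

A quick sanity check is to map one world point through both routes: transforming it with `T_w` and then `T_wn` should give the same result as transforming it directly with `T_n`.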

Stereo Calibration Using Planar Homography Transformation.
Figure 13 summarizes the whole process we followed to find the relative pose of the narrow-angle camera while keeping the wide-angle camera as the reference. As in the method described in Section 4.2, we first undistorted the images and used them as input data.
This second approach uses Zhang's method to perform stereo calibration. To apply Zhang's method, however, we need to know the correct relationship between point locations in the two camera images. Because the narrow-angle camera sees only a partial area of the full checkerboard pattern, we could not identify this relationship directly. Therefore, we applied a planar homography transformation to the wide-angle camera images to project point locations into the viewpoint of the narrow-angle camera.
To calculate the planar homography matrix $H$, we need at least four corresponding image points between the wide-angle and narrow-angle images. This means that we need to know at least four sets of 2D image coordinates of the checkerboard pattern. Because of the FOVs of the two cameras, the wide-angle camera captures both the inner and outer patterns, whereas the narrow-angle camera captures only the full area of the inner pattern (with some partial areas of the outer pattern).
Therefore, we decided to retain only the inner pattern in both the wide-angle and narrow-angle images. We followed the same channel splitting method, but this time we extracted only the inverted G channel. This allows the inner pattern to be identified separately in both camera images. We manually selected four exact common point locations in the images to calculate the matrix $H$, as shown in Figure 14. According to [18], the homography transformation relationship between two corresponding 2D point locations can be summarized, in homogeneous coordinates, as
$$x_N = H x_W,$$
where $H$ is the 3 × 3 homography transformation matrix that we are interested in calculating, and $x_W$ and $x_N$ represent the known 2D point locations we selected in the wide-angle and narrow-angle camera images, respectively. Using the above four point correspondences, we find this $H$ matrix based on singular value decomposition.
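Estimating a homography from four correspondences with singular value decomposition can be sketched with the direct linear transform (DLT). This is a generic, unnormalized DLT and is not necessarily the exact procedure used in this paper; for real pixel data, normalizing the points first is often recommended. The point coordinates below are made up for illustration.

```python
import numpy as np

def homography_dlt(src, dst):
    """Estimate H (3x3) such that dst ~ H @ src from >= 4 point pairs via SVD."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on the entries of H.
        rows.append([-x, -y, -1.0, 0.0, 0.0, 0.0, u * x, u * y, u])
        rows.append([0.0, 0.0, 0.0, -x, -y, -1.0, v * x, v * y, v])
    A = np.array(rows)
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]  # fix the scale; assumes H[2, 2] is not zero

# Four hypothetical corner locations in the wide-angle image ...
pts_wide = [(0.0, 0.0), (100.0, 0.0), (100.0, 80.0), (0.0, 80.0)]
# ... and where the same corners appear in the narrow-angle image (scaled and shifted).
pts_narrow = [(10.0, 5.0), (210.0, 5.0), (210.0, 165.0), (10.0, 165.0)]

H = homography_dlt(pts_wide, pts_narrow)
```

With exactly four pairs in general position the system has a one-dimensional null space, so the homography is determined up to scale; additional pairs turn this into a least-squares estimate.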
After calculating the $H$ matrix, we next find the chessboard corners of the outer pattern in the wide-angle images. We followed the steps mentioned in [20] to find the chessboard corner locations accurately. We first extract the R channel to retain the outer pattern and find the 2D point locations of all 54 corners. Next, we apply the $H$ matrix to identify where these corner points project onto the narrow-angle images (Figure 15). Green circles in the narrow-angle images represent these projected point locations. We adjusted these points with subpixel accuracy to maximize their cornerness criteria. Once we find the 2D coordinates of common object points in both the wide-angle and narrow-angle images, we can process them with Zhang's method to perform stereo calibration between the two cameras.
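Projecting the detected corners with a homography amounts to a matrix product in homogeneous coordinates followed by division by the third component. The sketch below uses a made-up homography and a synthetic 9 × 6 grid standing in for the 54 detected corners; it does not include the subpixel refinement step.

```python
import numpy as np

def project_points(H, pts):
    """Apply a 3x3 homography to an (N, 2) array of points and return an (N, 2) array."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])  # to homogeneous coordinates
    mapped = homog @ np.asarray(H, dtype=float).T
    return mapped[:, :2] / mapped[:, 2:3]             # back to pixel coordinates

# Synthetic stand-in for 54 outer-pattern corners found in a wide-angle image.
xs, ys = np.meshgrid(np.arange(9) * 30.0 + 100.0, np.arange(6) * 30.0 + 80.0)
corners_wide = np.column_stack([xs.ravel(), ys.ravel()])

# Hypothetical wide-to-narrow homography (pure scale and shift for readability).
H = np.array([[2.0, 0.0, -150.0],
              [0.0, 2.0, -120.0],
              [0.0, 0.0, 1.0]])

corners_narrow = project_points(H, corners_wide)
```

Projected corners of the outer pattern may fall outside the narrow-angle image bounds, so in practice they are used as calibration coordinates rather than as pixels that must be visible.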

Experiments and Results
We performed four experiments (two for method 1 and two for method 2) to evaluate the robustness of the two proposed methods, in both indoor and outdoor environments. We used the experimental setup shown in Figure 6 for the indoor experiments and mounted it above the front mirror of a vehicle for the outdoor experiments. We used the same number of images (30) in every experiment. Table 1 summarizes the intrinsic camera parameters for both cameras. Parameters $f_x$ and $f_y$ represent the focal lengths expressed in pixel units in the $x$ and $y$ directions. $c_x$ and $c_y$ represent the $x$ and $y$ components of the principal point. Table 2 summarizes the experimental results calculated for both indoor and outdoor environments with method 1 (Section 4.2) and method 2 (Section 4.3). Parameters $R_x$, $R_y$, and $R_z$ represent the components of rotation in the $x$, $y$, and $z$ directions, whereas parameters $t_x$, $t_y$, and $t_z$ represent the components of translation in the $x$, $y$, and $z$ directions.
We calculated and compared the reprojection errors of both methods, that is, the root mean square (RMS) of the Euclidean distances between the observed chessboard corner points in the image coordinate system (in the 2D calibration images) and the corresponding projected object points. We referred to [18,21] to calculate these errors.
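The error measure described above can be written compactly. The sketch below shows one common definition, the square root of the mean squared point-to-point distance; specific calibration libraries may normalize slightly differently, and the function name is illustrative.

```python
import numpy as np

def rms_reprojection_error(observed, projected):
    """RMS of Euclidean distances between observed and reprojected 2D points."""
    observed = np.asarray(observed, dtype=float)
    projected = np.asarray(projected, dtype=float)
    squared_dist = np.sum((observed - projected) ** 2, axis=1)
    return float(np.sqrt(np.mean(squared_dist)))

# Made-up corner locations (pixels) and their reprojections.
observed = [[120.0, 80.0], [150.0, 80.0], [180.0, 81.0]]
projected = [[120.3, 79.8], [149.6, 80.1], [180.2, 81.4]]
print(rms_reprojection_error(observed, projected))
```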
We also performed image rectification [22] to see how accurate our calibration methods are. The experimental results affirm that the homography transformation method is slightly more accurate than the matrix multiplication method. Some indoor rectification results generated with both methods are shown in Figures 16 and 17, and outdoor results are shown in Figures 18 and 19, respectively. There, we drew epilines (green horizontal lines) to represent the rectification error graphically and additionally calculated the absolute $y$-value differences of the inner pattern's chessboard corner locations to represent it mathematically. For this purpose, we selected four stereo image pairs from the outdoor environment that were rectified using the calibration parameters of the two methods. From each image set, we extracted the inner pattern areas, estimated 35 chessboard corner locations (as mentioned in [20]), and calculated the $y$-value differences (in pixels) between corresponding point locations in the wide-angle and narrow-angle images. We summarized the average difference of each individual image set along with the overall average (the term Average Err.). Table 3 lists these results in pixels.
We performed another experiment to compare the accuracy of calibration using the embedded checkerboard pattern with that of a general black-white checkerboard pattern. We kept both patterns at the same position, where both cameras could see the full area. We calibrated the images of the black-white pattern according to the general version of Zhang's method. We used our proposed homography transformation method to calibrate the images of the embedded pattern. Similarly, we performed image rectification and calculated 2D pixel positions to confirm that the combination of our embedded pattern and the homography-based method gives better results than the general method with the black-white pattern (Figure 20).
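The rectification error measure can be sketched as the mean absolute difference of the vertical coordinates of corresponding corners, since in a well-rectified pair matching points lie on the same image row. The corner values below are made up for illustration and the function name is hypothetical.

```python
import numpy as np

def mean_abs_y_difference(corners_wide, corners_narrow):
    """Mean absolute y-coordinate difference (pixels) between matched corners."""
    cw = np.asarray(corners_wide, dtype=float)
    cn = np.asarray(corners_narrow, dtype=float)
    return float(np.mean(np.abs(cw[:, 1] - cn[:, 1])))

# Made-up matched corner locations (x, y) in a rectified image pair.
wide = [[210.0, 100.0], [240.0, 100.5], [270.0, 101.0]]
narrow = [[95.0, 101.0], [155.0, 100.0], [215.0, 102.5]]
print(mean_abs_y_difference(wide, narrow))
```

Averaging this value over the 35 inner-pattern corners of each image pair, and then over the pairs, gives a per-pair error and an overall average of the kind reported in Table 3.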

Conclusions
In this paper, we proposed two new methods to calibrate a heterogeneous stereo camera setup using a special colored checkerboard pattern. The heterogeneous camera setup consisted of a left wide-angle fish-eye lens camera and a right narrow-angle conventional camera. Because of the difference in viewing angles, we could not use the conventional black-white checkerboard pattern at a short distance from the cameras. Therefore, we designed a new color checkerboard pattern by combining two checkerboard patterns of different sizes. We embedded the smaller checkerboard pattern in the larger one, letting their colors blend. This color blending results in a special checkerboard pattern, which consists of an outer pattern and an inner pattern. We kept this pattern at a very close distance to the cameras and captured calibration image sequences to improve the estimated results. We used the RGB channel splitting method to identify the two patterns separately.
In our first method, we performed stereo calibration between the cameras by calculating the left and right transformation matrices. In our second method, we calculated a planar homography relationship between the two cameras to identify common object point locations in the stereo images. We projected the chessboard corner locations of the outer pattern into the viewpoint of the narrow-angle camera using the calculated homography relationship. Zhang's calibration method was then applied to calibrate the stereo camera rig. We generated rectification results to evaluate the robustness of our two proposed methods and found that the second method was slightly more accurate than the first.
As future work, we plan to parallelize both calibration approaches on a GPU-based Nvidia Jetson TK1 board to reduce the computation time and to use them in an embedded smart vehicle system for lane detection. In addition, we plan to enhance the accuracy by updating the calibration results using the well-known 5-point algorithm and parallelized SIFT-GPU-based corresponding point extraction.

Figure 1: The concept of focal length. The rays converge at the focal point.

Figure 2: The inverse relationship between the focal length and the FOV. When the focal length is longer, the FOV is narrower, so that only a part of the light rays from the object converges. When the focal length is shorter, the FOV is wider, so that light from a wider area of the object converges.

Figure 3: An example showing how the viewing angles differ between the wide-angle and narrow-angle cameras. The left side consists of the wide-angle camera, which covers a larger area of the scene. The right side consists of the narrow-angle camera, which covers only a smaller part of the scene. The general checkerboard pattern is only partially seen by the narrow-angle camera.

Figure 4: The special colored checkerboard pattern used instead of the conventional black-white checkerboard pattern. The pattern is a combination of two differently colored checkerboard patterns. The outer pattern consists of red and blue checkers, and the small inner pattern consists of red, yellow, blue, and cyan checkers. Aspect ratio between the two patterns is 2 : 1.

Figure 5: The method of generating the special color checker pattern. (a) The 7 × 10 red-blue outer pattern is mixed with the 6 × 8 black-green smaller pattern. (b) The basic colors blend and result in red, yellow, blue, and cyan colors. This mixing results in the inner pattern shown in Figure 4.
In Figure 8, we can easily identify the outer pattern from the wide-angle image by extracting the R channel and the inner pattern from the narrow-angle image by extracting the inverted G channel.

Figure 6: The stereo camera setup we used in our experiments. The left side consists of the wide-angle camera and the right side consists of the narrow-angle camera. Both cameras are mounted on a horizontal panning bar. The special checkerboard pattern is kept very near to the camera setup.

Figure 8: The method of separately identifying the outer and inner patterns from the wide-angle and narrow-angle cameras, respectively. The original wide-angle and narrow-angle camera images are shown in (a) and (b). (c) shows the extracted R channel of the wide-angle camera image, and (d) shows the inverted G channel of the narrow-angle camera image.

Figure 9: An instance of mono calibration. The first, second, and third columns show the original, split-channel, and undistorted images. (a) The R channel is extracted from the wide-angle camera images for calibration. (b) The inverted G channel is extracted from the narrow-angle camera images for calibration.

Figure 11: Calculating the relative pose between the wide-angle and narrow-angle cameras using two transformation matrices obtained with respect to the world coordinate system.

Figure 13: Stereo calibration method based on finding a planar homography relationship between the wide-angle and narrow-angle camera images. Four points from the inner pattern in both camera images are used to calculate the homography transformation.

Figure 14: Finding the planar homography transformation between the wide-angle and narrow-angle camera images using the inner pattern.

Figure 15: Applying the homography to project the 54 corner points of the wide-angle image onto the narrow-angle image. Green points depict the respective projected corner points.

Figure 16: Rectified canvas result for the indoor environment from method 1. Green horizontal lines represent the epilines of the rectified images. The bottom edge has some rectification errors.

Figure 17: Rectified canvas result for the indoor environment from method 2.

Figure 18: Rectified canvas result for the outdoor environment from method 1. Some small rectification errors exist.

Figure 19: Rectified canvas result for the outdoor environment from method 2.

Figure 20: Comparison of rectification results when using the general pattern and the embedded pattern. (a) depicts the result for the general pattern and (b) depicts the result for the colored checker pattern.

Table 2: Stereo calibration results for both methods in indoor and outdoor environments.

Figure 12: Stereo calibration method by multiplying left-right transformation matrices. The input images are undistorted using the distortion coefficients calculated in the mono camera calibration step.

Table 3: Comparison of rectification errors for 4 rectified stereo image pairs.