Robust Calibration of Cameras with Telephoto Lens Using Regularized Least Squares

This Article is brought to you for free and open access by the Electrical & Computer Engineering at ODU Digital Commons. It has been accepted for inclusion in Electrical & Computer Engineering Faculty Publications by an authorized administrator of ODU Digital Commons. For more information, please contact digitalcommons@odu.edu. Repository Citation Liang, Mingpei; Huang, Xinyu; Chen, Chung-Hao; Zheng, Gaolin; and Tokuta, Alade, "Robust Calibration of Cameras with Telephoto Lens Using Regularized Least Squares" (2014). Electrical & Computer Engineering Faculty Publications. Paper 67. http://digitalcommons.odu.edu/ece_fac_pubs/67


Introduction
In various vision-based applications, a camera with a telephoto lens is often useful for acquiring detailed information about objects. It can capture high-resolution face images for recognition and reconstruction even when the user is at a distance [1]. It can also obtain eye images with rich iris textures when the user is several meters away from the camera [2,3]. In [4], a telephoto lens is used to observe objects under the influence of optical turbulence. In [5], a telephoto camera is combined with a wide-angle camera in a robotic vision system that is suitable for remote surveillance or minimally invasive surgical interventions and that could offer a higher resolution than typical commercial endoscopes. Because the field of view of a telephoto lens may span only a few degrees (e.g., around 8 degrees for a 300 mm telephoto lens), an accurate estimate of the camera parameters is required in order to either track objects or reconstruct complete views.
In the photogrammetry community, camera calibration is usually done by computing the projection matrix from accurate 3D points and the corresponding 2D observations [6,7]. In practice, however, it can be difficult or expensive to build an object with accurate coordinates, especially in a large working space. In computer vision, the calibration technique of [8], which requires only a planar pattern (e.g., a checkerboard pattern), is widely used. In this technique, a planar pattern is placed at different orientations and at different distances from the camera. Homographies are estimated between the planar pattern and its observations. These homographies form a homogeneous system that is solved for the image of the absolute conic. The intrinsic and extrinsic parameters are then computed from the estimated homographies and the image of the absolute conic. In the final step, maximum likelihood estimation (MLE) is applied to estimate the radial distortion and to refine the intrinsic and extrinsic parameters by minimizing geometric errors. This technique has further been evaluated with respect to image noise level, number of planes, and orientation of the model plane. Various autocalibration techniques [9,10] have also been proposed to estimate fixed or varying intrinsic parameters without predefined calibration patterns. The basic idea is that the absolute conic is fixed when a camera moves rigidly.
Since the focal lengths of cameras in many vision applications are relatively short, most existing algorithms consider image noise the major source of estimation uncertainty, and limited research has been conducted on the uncertainties caused by focal length [11,12]. In [11], Strobl et al. found that a narrow field of view makes calibration more difficult because of the lack of evidence on perspectivity. To improve the calibration accuracy, the camera with a narrow field of view is mounted on a robotic manipulator from which the rigid motions can be read. These rigid motions provide more constraints for solving the intrinsic parameters and the relative geometric relation between the camera and the robotic manipulator. Similarly, a pan-tilt unit can be used during the calibration, as shown in [12].
This paper makes two main contributions. First, we present a first-order error analysis that shows the relation between estimation uncertainties and focal length. Although the authors of [11,12] briefly described the calibration problem caused by a long focal length, the error analysis with respect to focal length has not been studied so far. Second, we propose a robust algorithm that requires no additional devices and is based on a regularization term defined by a prior on the image of the absolute conic.
The remainder of this paper is organized as follows. Section 2 introduces the necessary notation and the background of the existing algorithm that uses estimated homographies. Section 3 gives the error propagation from image noise to camera parameters. Our calibration algorithm is proposed in Section 4. Section 5 shows the experiments on simulated and real data. The conclusion is given in Section 6.

Notation and Background
In this section, we start with the notation and then briefly introduce the calibration technique proposed in [8].
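The homography between the planar pattern and the image plane is denoted by $H = [h_1\ h_2\ h_3]$, and the intrinsic matrix is given by

$$K = \begin{bmatrix} \alpha_x & s & u_0 \\ 0 & \alpha_y & v_0 \\ 0 & 0 & 1 \end{bmatrix}, \tag{1}$$

where $(u_0, v_0)$ are the coordinates of the principal point, $s$ is the skew factor, $\alpha_x = f m_x$ and $\alpha_y = f m_y$ are scale factors, and $m_x$ and $m_y$ are the numbers of pixels per unit distance in the image along the $x$ and $y$ directions. The image of the absolute conic is $\omega = K^{-T} K^{-1}$.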
Given an image of a planar pattern, two constraints can be imposed on the intrinsic parameters: $h_1^T \omega h_2 = 0$ and $h_1^T \omega h_1 = h_2^T \omega h_2$. Therefore, a constrained optimization can be formed by

$$\min_{\omega_v} \|A \omega_v\|^2 \quad \text{subject to} \quad \|\omega_v\| = 1, \tag{2}$$

where $\omega_v$ is a 6 × 1 vector extracted from $\omega$ and $A$ is a 2 × 6 matrix constructed from the entries of $H$ (one such block per image, stacked when several images are used). The intrinsic matrix $K$ is computed from $\omega$ by the Cholesky factorization. Since the closed-form solution is obtained by minimizing algebraic errors, maximum likelihood estimation is further applied to refine the results by minimizing geometric errors.
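The closed-form step described above can be sketched in a few lines. This is a minimal illustration, not the implementation of [8]: the ordering of $\omega_v$ as $(\omega_{11}, \omega_{12}, \omega_{22}, \omega_{13}, \omega_{23}, \omega_{33})$, the function names, and the synthetic, noise-free homographies are all assumptions made for the example.

```python
import numpy as np


def v_ij(H, i, j):
    """Row v with h_i^T * omega * h_j == v @ omega_v.

    Assumed ordering: omega_v = (w11, w12, w22, w13, w23, w33).
    """
    hi, hj = H[:, i], H[:, j]
    return np.array([
        hi[0] * hj[0],
        hi[0] * hj[1] + hi[1] * hj[0],
        hi[1] * hj[1],
        hi[0] * hj[2] + hi[2] * hj[0],
        hi[1] * hj[2] + hi[2] * hj[1],
        hi[2] * hj[2],
    ])


def closed_form_intrinsics(homographies):
    """Solve min ||A w|| s.t. ||w|| = 1, then recover K by Cholesky."""
    rows = []
    for H in homographies:
        rows.append(v_ij(H, 0, 1))                   # h1^T w h2 = 0
        rows.append(v_ij(H, 0, 0) - v_ij(H, 1, 1))   # h1^T w h1 = h2^T w h2
    A = np.asarray(rows)
    w = np.linalg.svd(A)[2][-1]      # right singular vector of smallest sigma
    omega = np.array([[w[0], w[1], w[3]],
                      [w[1], w[2], w[4]],
                      [w[3], w[4], w[5]]])
    if omega[0, 0] < 0:              # the SVD sign is arbitrary
        omega = -omega
    L = np.linalg.cholesky(omega)    # omega = L L^T, L proportional to K^{-T}
    K = np.linalg.inv(L.T)
    return K / K[2, 2]


def rotation(pan_deg, tilt_deg):
    """Hypothetical helper: pan about Y followed by tilt about X."""
    pan, tilt = np.radians([pan_deg, tilt_deg])
    Ry = np.array([[np.cos(pan), 0.0, np.sin(pan)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(pan), 0.0, np.cos(pan)]])
    Rx = np.array([[1.0, 0.0, 0.0],
                   [0.0, np.cos(tilt), -np.sin(tilt)],
                   [0.0, np.sin(tilt), np.cos(tilt)]])
    return Ry @ Rx


# Noise-free check with a short focal length (illustrative values only).
K_true = np.array([[800.0, 0.5, 320.0],
                   [0.0, 820.0, 240.0],
                   [0.0, 0.0, 1.0]])
t = np.array([0.1, -0.2, 5.0])
poses = [(30, 10), (-25, 20), (10, -35), (-15, -30)]
Hs = []
for pan, tilt in poses:
    R = rotation(pan, tilt)
    Hs.append(K_true @ np.column_stack([R[:, 0], R[:, 1], t]))
K_est = closed_form_intrinsics(Hs)
```

With noisy homographies and a long focal length, the recovered $\omega$ may fail to be positive definite, in which case the Cholesky step raises an error. That failure mode is one symptom of the ill-conditioning analyzed in the following sections.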

The calibration performance with respect to image noise level, the number of planes, and the orientation of the model plane is also evaluated in [8]. Based on computer simulations, the errors increase linearly with the image noise level and decrease when more images are used. The best orientation of the model plane is around 45 degrees.

Covariance of the Estimated Intrinsic Parameters
In order to find the relation between focal length and the uncertainties of the camera parameters, a point estimate of the parameters alone is not sufficient. In this section, we present a first-order approximation of the covariance of the estimated parameters.
Let us consider two cameras with different focal lengths $f_1$ and $f_2$ (assume $f_1 > f_2$) that share the same image plane. The origin $O$ is located at the center of projection of camera 1. Through a single point $(x_s, y_s)$ on the sensor, camera 1 observes a 3D point $\mathbf{X}_1 = (X_1, Y_1, Z_1)^T$ and camera 2 observes a 3D point $\mathbf{X}_2 = (X_2, Y_2, Z_2)^T$, where $\mathbf{X}_1$ and $\mathbf{X}_2$ are located on the same planar pattern. We also define a transformation $T$ that transforms the 3D points $\mathbf{X}_1$ and $\mathbf{X}_2$ to a coordinate system on the planar pattern such that the new depths $\tilde{Z}_1$ and $\tilde{Z}_2$ are equal to 0. Let us denote $\tilde{\mathbf{x}}_1 = T \cdot (X_1, Y_1)^T$ and $\tilde{\mathbf{x}}_2 = T \cdot (X_2, Y_2)^T$. The configuration is shown in Figure 1.
As this configuration consists of the same image plane, the same 2D observations, and the same orientations and locations of the planar pattern, the major difference between the two cameras is the focal length. For simplicity, we assume that all pixels are square, so that

$$x_s = f_1 \frac{X_1}{Z_1}, \qquad y_s = f_1 \frac{Y_1}{Z_1}. \tag{3}$$

Since the center of projection of camera 2 is at $(f_1 - f_2, 0)^T$, through the same point $(x_s, y_s)$ we have

$$x_s = f_2 \frac{X_2}{Z_2 + f_2 - f_1}, \qquad y_s = f_2 \frac{Y_2}{Z_2 + f_2 - f_1}. \tag{4}$$

Putting this together with (3) leads to the formula

$$X_2 = \frac{f_1}{f_2} \cdot \frac{Z_2 + f_2 - f_1}{Z_1} \, X_1, \qquad Y_2 = \frac{f_1}{f_2} \cdot \frac{Z_2 + f_2 - f_1}{Z_1} \, Y_1. \tag{5}$$

Let $\mathbf{x} = (x, y)$ be the image coordinate of the point $(x_s, y_s)$ on the sensor. Assuming that the two cameras have the same image resolution and principal point, $\mathbf{x}$ is the same for both cameras. Let us further assume that the noise is limited to the observed image points, with covariance $\Sigma_x$. The covariance of the intrinsic parameters, $\Sigma_k$, is then obtained by first-order propagation of $\Sigma_x$ (6), where $J_k$, $J_{\omega_v}$, and $J_h$ are the Jacobian matrices evaluated at $\hat{k}$, $\hat{\omega}_v$, and $\hat{h}$, respectively, and $k$, $\omega_v$, and $h$ are the vectors made up of the entries of the intrinsic matrix $K$, the image of the absolute conic $\omega$, and the 2D homography $H$. As $\Sigma_x$, $J_k$, and $J_{\omega_v}$ only depend on $k$ and $\omega_v$ and are the same for both cameras, we only need to analyze the relation between $J_h$ of the two cameras. The Jacobian matrix $J_h^i = \partial \mathbf{x}_i / \partial h$ for the $i$th observed point is given by (7). Based on (5) and (7), it is not difficult to relate the covariances of the two homographies (8), where $h_1$ and $h_2$ are vectors made up of the entries of the homographies between the planar pattern and cameras 1 and 2, $Z_{1,i}$ and $Z_{2,i}$ are the depths of the $i$th 3D points observed by cameras 1 and 2, respectively, and $\Sigma_i$ is the covariance matrix of the $i$th measured image point.

From this relation, we can see that the covariance of the 2D homography is also affected by the focal length and by the orientation and depth of the planar pattern. Notice that $f_2 - f_1$ is usually far smaller in magnitude than the depths $Z_{1,i}$ and $Z_{2,i}$, so we can approximate $(Z_{2,i} + f_2 - f_1)/Z_{1,i}$ by $Z_{2,i}/Z_{1,i}$. Since image resolution and focal length are fixed for the two cameras, one possible direction for reducing the uncertainties of the estimated homography is to increase the pan and tilt angles of the planar pattern so that $Z_{2,i}/Z_{1,i}$ is very close to 0. However, as mentioned in [8], the best orientation is around 45 degrees, which means that this ratio cannot be very small. This direction is also not feasible in practice because of the limited depth of field. When a region of the planar pattern is outside the depth of field, the sharpness of the region decreases and the image noise modeled by $\Sigma_x$ increases. Moreover, as the planar pattern can be considered uniformly distributed within the field of view of a camera, the expectations of both $Z_{1,i}$ and $Z_{2,i}$ are close to the depth $Z_0$ shown in Figure 1. Therefore, for simplicity, it is reasonable to approximate $Z_{2,i}/Z_{1,i}$ as 1. As a result, (8) can be simplified to $\Sigma_{h_2} = (f_2/f_1)^2 \cdot \Sigma_{h_1}$, and the relation between the covariance matrices of the intrinsic parameters of the two cameras is given by

$$\Sigma_{k_2} = \left(\frac{f_2}{f_1}\right)^2 \Sigma_{k_1}. \tag{9}$$

Therefore, the uncertainties of the intrinsic parameters $k$ increase when the focal length increases. Since the extrinsic parameters for each image can be determined from the intrinsic parameters and the corresponding homography, the uncertainties of the extrinsic parameters also depend on the focal length.

Figure 5: Uncertainties of $\alpha_x$, $\alpha_y$, $u_0$, and $v_0$ using Zhang's algorithm [8].
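The two-camera configuration can be checked numerically. The sketch below works in a 2D cross-section $(X, Z)$ and assumes that both optical axes coincide with the $Z$ axis, that the shared image plane sits at $Z = f_1$, and that the pattern is the line $Z = Z_0 + X \tan\theta$. Under these assumptions, similar triangles give $X_2 = (f_1/f_2) \cdot ((Z_2 + f_2 - f_1)/Z_1) \cdot X_1$. The function name and all numeric values are illustrative only.

```python
import math


def ray_plane_hit(center_z, focal, x_s, z0, tilt_deg):
    """Intersect a camera ray with the pattern line Z = z0 + X * tan(tilt).

    The ray starts at the camera centre (0, center_z) and passes through the
    sensor point (x_s, center_z + focal) on the shared image plane.
    """
    slope = math.tan(math.radians(tilt_deg))
    # Ray: (X, Z) = (0, center_z) + t * (x_s, focal)
    t = (z0 - center_z) / (focal - x_s * slope)
    return t * x_s, center_z + t * focal


f1, f2 = 300.0, 50.0          # focal lengths in mm, f1 > f2
x_s = 5.0                     # sensor coordinate in mm
z0, tilt = 5000.0, 30.0       # pattern depth (mm) and tilt (degrees)

X1, Z1 = ray_plane_hit(0.0, f1, x_s, z0, tilt)        # camera 1 at the origin
X2, Z2 = ray_plane_hit(f1 - f2, f2, x_s, z0, tilt)    # camera 2 at Z = f1 - f2

# Similar-triangle relation between the two observed pattern points.
predicted_X2 = (f1 / f2) * (Z2 + f2 - f1) / Z1 * X1

# The depth ratio with and without the (f2 - f1) offset.
exact_ratio = (Z2 + f2 - f1) / Z1
approx_ratio = Z2 / Z1
```

For these values the two depth ratios differ by $(f_1 - f_2)/Z_1$, which is about 0.05. This illustrates why the offset is negligible once the pattern is several meters away.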
One might think that the uncertainties could be reduced by choosing the affine camera model. The intrinsic matrix of the affine camera model contains fewer parameters (i.e., it has no principal point), and choosing a simpler model is one way to avoid overfitting. However, the scale factors still exist in the intrinsic matrix of the affine camera model. The derivations in this section can easily be extended to the affine camera model. Therefore, it can be shown that the estimation uncertainties under the affine camera model also increase when the focal length increases.

Calibration Using Regularized Least Squares
When a long focal length is used, the matrix $A$ that is used to estimate $\omega_v$ (shown in (2)) is ill conditioned. As a result, large perturbations of the intrinsic parameters cause only small changes in the error sum of squares. Since it is often difficult to obtain other data points outside the scope of a sensor with limited physical dimensions, in this section we apply a simple and effective prior on the image of the absolute conic to reduce the uncertainties.
First, the focal length is set to the value provided by the camera. Although this value differs from the focal length in the pinhole camera model, the two are usually of the same order. The numbers of pixels per unit distance, $m_x$ and $m_y$, can be computed from the sensor size and image resolution. The skew factor is close to 0. The principal point is located around the middle of the image. This location is a close approximation according to [13], which shows that the principal point varies around the image center with some nonlinear patterns when the zoom and focus factors vary. Thus, the prior knowledge of the intrinsic parameters can be denoted as $(p_1, p_2, p_3, p_4, p_5)^T = (\alpha_x, \alpha_y, 0, u_0, v_0)^T$.
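The prior can be computed directly from the camera's nominal specifications. The sketch below is a minimal illustration with a hypothetical function name. It uses the simulation settings of Section 5 (23.6 × 15.8 mm sensor, 2048 × 1536 pixels, 300 mm lens) as example input and takes the image center as the principal point.

```python
def intrinsic_prior(focal_mm, sensor_mm, resolution_px):
    """Return (alpha_x, alpha_y, skew, u0, v0) from nominal camera data.

    alpha = f * m, where m is the number of pixels per millimetre along
    each axis; the skew prior is 0 and the principal point prior is the
    image centre.
    """
    sensor_w, sensor_h = sensor_mm
    width, height = resolution_px
    m_x = width / sensor_w      # pixels per mm, horizontal
    m_y = height / sensor_h     # pixels per mm, vertical
    return (focal_mm * m_x, focal_mm * m_y, 0.0, width / 2.0, height / 2.0)


prior = intrinsic_prior(300.0, (23.6, 15.8), (2048, 1536))
alpha_x, alpha_y, skew, u0, v0 = prior
```

For this setting the horizontal scale factor rounds to 26034, the value quoted in the caption of Figure 2. The vertical value need not match the caption to the last digit, because the rounding applied to $m_y$ is not stated in the text.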
One possible solution is to apply this prior directly to the estimation of $K$. However, this could require a nonlinear optimization because of the Cholesky decomposition. In order to obtain a closed-form solution, we transform it into a prior on the image of the absolute conic based on $\omega = K^{-T} K^{-1}$. Hence, the prior used in our algorithm is defined by

$$\omega^{p} = p_1^2 \, K_p^{-T} K_p^{-1}, \qquad K_p = \begin{bmatrix} p_1 & p_3 & p_4 \\ 0 & p_2 & p_5 \\ 0 & 0 & 1 \end{bmatrix}. \tag{10}$$

Notice that we normalize $\omega$ such that the first entry of $\omega$ is 1. This is different from the original constraint $\|\omega_v\| = 1$ in (2). The reason is that some entries are very close to 0 when a long focal length is used, which can make solving for the intrinsic parameters $k$ numerically unstable. For example, for a 300 mm lens and an image resolution of around 4000 × 3000, some entries of $\omega$ are on the order of $10^{-10}$ and some entries during the Cholesky factorization can be on the order of $10^{-20}$ when $\|\omega_v\| = 1$ is applied.

Figure 6: Uncertainties of $\alpha_x$, $\alpha_y$, $u_0$, and $v_0$ using our algorithm.
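The transformation of the parameter prior into a prior on the image of the absolute conic can be sketched as follows. The function names, the ordering $(\omega_{11}, \omega_{12}, \omega_{22}, \omega_{13}, \omega_{23}, \omega_{33})$ of the vectorized conic, and the example prior values (taken from the 300 mm simulation setting) are assumptions for illustration.

```python
import numpy as np


def prior_conic(p):
    """Image of the absolute conic built from the prior, scaled so w11 = 1."""
    p1, p2, p3, p4, p5 = p
    K_p = np.array([[p1, p3, p4],
                    [0.0, p2, p5],
                    [0.0, 0.0, 1.0]])
    K_inv = np.linalg.inv(K_p)
    omega = K_inv.T @ K_inv          # omega = K^{-T} K^{-1}
    return omega / omega[0, 0]       # first entry normalised to 1


def conic_to_vector(omega):
    """Assumed ordering: (w11, w12, w22, w13, w23, w33)."""
    return np.array([omega[0, 0], omega[0, 1], omega[1, 1],
                     omega[0, 2], omega[1, 2], omega[2, 2]])


p = (26034.0, 29163.0, 0.0, 1024.0, 768.0)
omega_p = prior_conic(p)
omega_p_v = conic_to_vector(omega_p)
c = omega_p_v[1:]                    # elements 2-6 form the prior vector c

# The same conic under the unit-norm constraint, for comparison.
unit_norm_v = omega_p_v / np.linalg.norm(omega_p_v)
```

With zero skew, the normalized conic has the entries $\omega_{13} = -p_4$ and $\omega_{22} = p_1^2 / p_2^2$. Under the unit-norm scaling, the first entry shrinks to roughly $10^{-9}$ for these values, which is the kind of magnitude gap that motivates the normalization.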
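The original homogeneous system in (2) is then converted to an inhomogeneous system by applying the prior from (10):

$$\min_{\bar{\omega}_v} \|\bar{A} \bar{\omega}_v + a\|^2 \quad \text{subject to} \quad \|\bar{\omega}_v - c\|^2 \le t \tag{11}$$

for an appropriate value of $t$, where $A = [a\ \bar{A}]$ ($a$ is the first column of $A$) and $\bar{\omega}_v$ and $c$ are elements 2-6 of $\omega_v$ and of the prior vector from (10), respectively. The estimate $\hat{\bar{\omega}}_v$ can be obtained by solving the corresponding unconstrained regularized least squares problem

$$\hat{\bar{\omega}}_v = \arg\min_{\bar{\omega}_v} \|\bar{A} \bar{\omega}_v + a\|^2 + \lambda \|\bar{\omega}_v - c\|^2 = (\bar{A}^T \bar{A} + \lambda I)^{-1} (\lambda c - \bar{A}^T a) \tag{12}$$

for some positive constant $\lambda$. The expectation of $\hat{\bar{\omega}}_v$ can be computed by

$$E[\hat{\bar{\omega}}_v] = \bar{\omega}_v + \lambda (\bar{A}^T \bar{A} + \lambda I)^{-1} (c - \bar{\omega}_v). \tag{13}$$

Thus the estimator from (12) is biased after introducing the prior $c$. The second term of this equation is the bias. As $\lambda$ increases, the bias increases, and the expectation of $\hat{\bar{\omega}}_v$ eventually converges to $c$. In order to evaluate the covariance of $\hat{\bar{\omega}}_v$, let us define the function

$$\Phi(h, \bar{\omega}_v) = (\bar{A}^T \bar{A} + \lambda I) \bar{\omega}_v + \bar{A}^T a - \lambda c = 0. \tag{14}$$

Based on the implicit function theorem in [14,15], the Jacobian $J_{\bar{\omega}_v}$ can be approximated by

$$J_{\bar{\omega}_v} = -\left(\frac{\partial \Phi}{\partial \bar{\omega}_v}\right)^{-1} \frac{\partial \Phi}{\partial h}, \tag{15}$$

where $(\partial \Phi / \partial \bar{\omega}_v)^{-1}$ and $\partial \Phi / \partial h$ can be computed by

$$\frac{\partial \Phi}{\partial \bar{\omega}_v} = \bar{A}^T \bar{A} + \lambda I, \qquad \frac{\partial \Phi}{\partial h} = \left[\frac{\partial b_1}{\partial h} \ \cdots \ \frac{\partial b_5}{\partial h}\right]^T, \tag{16}$$

where $b_i$ is the $i$th row of the 5 × 1 vector $(\bar{A}^T \bar{A} + \lambda I) \bar{\omega}_v + \bar{A}^T a$.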
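The regularized solve can be sketched as below. The sketch assumes the sign convention $A \omega_v = a + \bar{A} \bar{\omega}_v$ (first entry of $\omega_v$ fixed to 1), so that the minimizer is $(\bar{A}^T \bar{A} + \lambda I)^{-1} (\lambda c - \bar{A}^T a)$. The function name and the random test system are illustrative and stand in for a matrix built from real homographies.

```python
import numpy as np


def solve_regularized(A, c, lam):
    """Minimise ||a + A_bar w||^2 + lam * ||w - c||^2 in closed form.

    A is the stacked constraint matrix, a its first column, A_bar the
    remaining five columns, and c the prior on elements 2-6 of omega_v.
    """
    a, A_bar = A[:, 0], A[:, 1:]
    lhs = A_bar.T @ A_bar + lam * np.eye(A_bar.shape[1])
    rhs = lam * c - A_bar.T @ a
    return np.linalg.solve(lhs, rhs)


# Synthetic, noise-free system whose exact solution is w_true.
rng = np.random.default_rng(0)
A_bar = rng.normal(size=(12, 5))
w_true = rng.normal(size=5)
a = -A_bar @ w_true
A = np.column_stack([a, A_bar])
c = w_true + 0.3                      # deliberately offset prior

w_small = solve_regularized(A, c, 1e-9)   # almost no regularisation
w_mid_lo = solve_regularized(A, c, 0.1)
w_mid_hi = solve_regularized(A, c, 10.0)
w_large = solve_regularized(A, c, 1e9)    # prior dominates
```

As $\lambda$ grows, the estimate moves monotonically from the least squares solution toward $c$, which mirrors the bias behaviour described in the text.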
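The covariance of $\hat{\bar{\omega}}_v$ is given by

$$\Sigma_{\bar{\omega}_v} = J_{\bar{\omega}_v} \, \Sigma_h \, J_{\bar{\omega}_v}^T. \tag{17}$$

Since $\bar{A}$ and $a$ are independent of $\lambda$ and only $(\bar{A}^T \bar{A} + \lambda I)^{-1}$ depends on $\lambda$, we can see that the covariance $\Sigma_{\bar{\omega}_v}$ decreases as $\lambda$ increases. The larger $\lambda$ is, the closer $\hat{\bar{\omega}}_v$ is to $c$. If we consider the mean squared error, it is possible to select an optimal value of $\lambda > 0$ at which the mean squared error on a testing set is minimized. In practice, we could divide the 2D points on a planar pattern into training and testing sets and apply cross-validation to choose the optimal $\lambda$.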
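Selecting $\lambda$ by cross-validation can be sketched as follows. This is a simplification: the text splits the 2D pattern points into training and testing sets, whereas the sketch splits the rows of the stacked linear system and scores each $\lambda$ by the held-out algebraic residual. The function names, fold count, candidate grid, and synthetic data are all assumptions.

```python
import numpy as np


def ridge(A, c, lam):
    """Closed-form minimiser of ||a + A_bar w||^2 + lam * ||w - c||^2."""
    a, A_bar = A[:, 0], A[:, 1:]
    n = A_bar.shape[1]
    return np.linalg.solve(A_bar.T @ A_bar + lam * np.eye(n),
                           lam * c - A_bar.T @ a)


def select_lambda(A, c, lams, folds=5, seed=0):
    """Return the candidate with the lowest mean held-out squared residual."""
    rng = np.random.default_rng(seed)
    parts = np.array_split(rng.permutation(A.shape[0]), folds)
    all_rows = np.arange(A.shape[0])
    scores = []
    for lam in lams:
        errs = []
        for held_out in parts:
            train = np.setdiff1d(all_rows, held_out)
            w = ridge(A[train], c, lam)
            resid = A[held_out, 0] + A[held_out, 1:] @ w
            errs.append(np.mean(resid ** 2))
        scores.append(float(np.mean(errs)))
    return lams[int(np.argmin(scores))], scores


# Synthetic noisy system (illustrative only).
rng = np.random.default_rng(1)
A_bar = rng.normal(size=(40, 5))
w_true = rng.normal(size=5)
a = -(A_bar @ w_true) + rng.normal(scale=0.5, size=40)
A = np.column_stack([a, A_bar])
c = w_true + 0.05                     # prior close to the truth
lams = [1e-3, 1e-2, 1e-1, 1.0, 10.0, 100.0]
best_lam, scores = select_lambda(A, c, lams)
```

A geometric criterion, such as the reprojection error on held-out pattern points, would follow the same loop with a different scoring function.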

Experiments
We tested our proposed algorithm on simulated data and real data over a large range of settings of focal lengths and image noise.

Simulations.
In our simulations, the image resolution is set to 2048 × 1536. The sensor size is 23.6 × 15.8 mm and the focal lengths are 50 mm, 100 mm, 200 mm, 300 mm, 400 mm, and 500 mm. The skew factor $s$ is set to 0.009. The principal point $(u_0, v_0)$ is set to the image center. Table 1 gives the focal lengths and the corresponding scale factors used in the experiments. Gaussian noise with $\mu = 0$ and $\sigma = 1$ or $\sigma = 3$ is added to the 2D observations. Since the depth of field is limited when a long focal length is used, the observed points can easily be blurred when the pan and tilt angles are large. Thus, we use a large standard deviation ($\sigma = 3$) of image noise to further test the robustness of our algorithm. The planar pattern is generated randomly with different pan/tilt angles and at different depths from the camera. Angles are uniformly distributed between −60 and +60 degrees. Foreshortening effects are not considered in the simulations. Depths are also uniformly distributed within 6 meters. As the calibration technique in [8] is widely used in computer vision, we implemented it as a baseline in order to compare the calibration performance of existing algorithms with that of our algorithm. In our simulations, we add 5% offsets to the priors of both the focal length and the principal point. We conducted 20 trials for each configuration.
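One simulated view under these settings might be generated as in the sketch below. Several details are assumptions made for the example, since the text does not specify them: the pattern layout (a 9 × 7 grid with 30 mm spacing), the lower depth bound of 1 m, the absence of lateral translation, and the pan-then-tilt rotation convention.

```python
import numpy as np


def rotation(pan_deg, tilt_deg):
    """Pan about Y followed by tilt about X (assumed convention)."""
    pan, tilt = np.radians([pan_deg, tilt_deg])
    Ry = np.array([[np.cos(pan), 0.0, np.sin(pan)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(pan), 0.0, np.cos(pan)]])
    Rx = np.array([[1.0, 0.0, 0.0],
                   [0.0, np.cos(tilt), -np.sin(tilt)],
                   [0.0, np.sin(tilt), np.cos(tilt)]])
    return Ry @ Rx


def project(H, pattern_xy):
    """Map planar pattern points to pixels through the homography H."""
    pts_h = np.column_stack([pattern_xy, np.ones(len(pattern_xy))])
    proj = (H @ pts_h.T).T
    return proj[:, :2] / proj[:, 2:3]


def simulate_view(K, pattern_xy, rng, sigma=1.0,
                  min_depth=1000.0, max_depth=6000.0):
    """Random pose within +/-60 degrees and 6 m, plus Gaussian pixel noise."""
    pan, tilt = rng.uniform(-60.0, 60.0, size=2)
    depth = rng.uniform(min_depth, max_depth)     # millimetres
    R = rotation(pan, tilt)
    t = np.array([0.0, 0.0, depth])
    H = K @ np.column_stack([R[:, 0], R[:, 1], t])
    noisy = project(H, pattern_xy) + rng.normal(0.0, sigma,
                                                size=(len(pattern_xy), 2))
    return H, noisy


# 300 mm setting from the text: scale factors, skew 0.009, centred principal point.
K = np.array([[26034.0, 0.009, 1024.0],
              [0.0, 29163.0, 768.0],
              [0.0, 0.0, 1.0]])
gx, gy = np.meshgrid(np.arange(9) - 4, np.arange(7) - 3)
pattern = 30.0 * np.column_stack([gx.ravel(), gy.ravel()])   # mm
H, observed = simulate_view(K, pattern, np.random.default_rng(0), sigma=1.0)
```

Repeating `simulate_view` for several poses and both noise levels yields the input for the baseline and regularized estimators.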
Figure 2 shows a comparison of uncertainties between the closed-form solutions and the solutions from MLE when $f = 300$ mm is used. Figure 2(c) shows that the RMS reprojection errors are reduced by minimizing the geometric errors. However, as the cost function is not convex and the initial guess from the closed-form solution can be far from the global minimum, the uncertainties of the intrinsic parameters cannot be reduced by the nonlinear refinement, as shown in Figures 2(a) and 2(b). The results are similar for the other settings in Table 1. This experiment shows that MLE can reduce the RMS errors for the training data points. However, it cannot reduce the uncertainties of the camera parameters.
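The RMS reprojection error referred to above can be computed as in the following sketch. It assumes a pure homography model between the pattern and the image and ignores radial distortion, which the MLE step in [8] also estimates. The function name and the toy data are illustrative.

```python
import numpy as np


def rms_reprojection_error(H, pattern_xy, observed_px):
    """Root mean square distance between projected and observed points."""
    pts_h = np.column_stack([pattern_xy, np.ones(len(pattern_xy))])
    proj = (H @ pts_h.T).T
    predicted = proj[:, :2] / proj[:, 2:3]
    residuals = predicted - observed_px
    return float(np.sqrt(np.mean(np.sum(residuals ** 2, axis=1))))


# Toy check: every observation is displaced by (3, 4), a distance of 5.
pattern = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
observed = pattern + np.array([3.0, 4.0])
err = rms_reprojection_error(np.eye(3), pattern, observed)
```

A small value of this error on the training points says nothing about how far the intrinsic parameters are from their true values, which is the point made by Figure 2.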
Figures 3 and 4 show the relation between focal length and the uncertainties of the intrinsic parameters. The uncertainties increase as the focal length increases. The absolute errors of the principal point can be very large, which indicates that the estimated principal point can lie far outside the image for long focal lengths. Figures 3(c), 3(d), 4(c), and 4(d) show the results of our algorithm. The uncertainties of both the focal length and the principal point are reduced to a few percent. The values estimated by our algorithm converge to the bias (i.e., 5%), which is consistent with (13). This further means that our algorithm should mainly be used for cameras with a long focal length (e.g., $f \geq 200$ mm, as shown in Figures 3 and 4). When the focal length is short, we need to choose either the algorithm in [8] or a very small $\lambda$.

Real Data.
We also test our algorithm on real data. The camera to be calibrated is a Canon EOS 450D. The sensor size is 22.2 × 14.8 mm. The image resolution used in the experiments is 2256 × 1504. We use a 300 mm telephoto lens in the experiments. The prior of the skew factor is set to 0. The prior of the principal point is set to the image center. The scale factors $\alpha_x$ and $\alpha_y$ are computed from the sensor size, image resolution, and focal length. Table 2 shows the priors for the intrinsic parameters.
Eighteen images of a planar pattern with different orientations and at different depths are captured within 6 meters. Nine of these images are randomly selected each time for calibration, and the same calibration procedure is repeated 20 times. Figure 5 shows the calibration results using Zhang's algorithm. We can see that the uncertainties of the intrinsic parameters are very large when a 300 mm telephoto lens is used. The estimated principal point can lie far outside the image, which is not reasonable in practice.
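The repeated random-subset protocol can be sketched as follows. The function names and the seed are assumptions, and the calibration call itself is left out because it depends on the estimator being evaluated. The sketch only draws the subsets and summarizes the spread of whatever scalar estimate each run returns.

```python
import random
import statistics


def repeated_subsets(n_images=18, subset_size=9, trials=20, seed=0):
    """Draw `trials` random subsets of image indices without replacement."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_images), subset_size))
            for _ in range(trials)]


def spread(estimates):
    """Mean and sample standard deviation of one parameter across trials."""
    return statistics.mean(estimates), statistics.stdev(estimates)


subsets = repeated_subsets()
# Each subset would be passed to a calibration routine; the per-trial
# estimates of a parameter (e.g. alpha_x) are then summarised with spread().
example_mean, example_std = spread([1.0, 2.0, 3.0])
```

The standard deviation across the 20 runs is one simple way to quantify the uncertainty that Figures 5 and 6 display as distributions.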
Figure 6 shows our calibration results with different $\lambda$. Similar to the simulation results, the uncertainties are greatly reduced and the estimated intrinsic parameters converge to the priors when $\lambda$ increases.

Conclusion
As a camera with a telephoto lens can be used in various vision-based systems, it is necessary to calibrate the camera accurately. Many existing algorithms that are designed for cameras with relatively short focal lengths can produce large uncertainties in the estimated parameters even when the RMS reprojection errors on the training data are small after a nonlinear optimization. In this paper, we first give a detailed error analysis that shows the relation between uncertainties and focal length. We then propose a robust calibration algorithm based on regularized least squares to reduce the uncertainties. In future work, we will apply our approach to camera networks that contain a camera with a telephoto lens, in the areas of remote surveillance and scene reconstruction.

Figure 1: Two cameras with the same image plane and different focal lengths.

Figure 2: Comparison of uncertainties between the closed-form solution and the MLE solution [8] ($f$ = 300 mm, $\alpha_x$ = 26034, and $\alpha_y$ = 29163). The image noise has $\mu = 0$ and $\sigma = 1$. The mean relative errors of $\alpha_x$ are 4.74% and 4.72%, the mean absolute errors of $u_0$ are 1717 and 1689 pixels, and the RMS reprojection errors are 19.5 and 3.2 pixels, for the closed-form and the MLE solutions, respectively.

Figure 3: Comparison of uncertainties of $\alpha_x$ between our algorithm and [8].

Figure 4: Comparison of uncertainties of $u_0$ between our algorithm and [8].
Panel titles (Figures 5 and 6): distribution of $\alpha_x$ and $\alpha_y$ (Zhang's algorithm); distribution of $\alpha_x$ with different $\lambda$ (our algorithm).

Table 1: Scale factors of the camera used in simulations.

Table 2: Priors of the intrinsic parameters.