Linear Correction and Matching Method for 3D Line Structure Reconstruction



Introduction
Using the camera imaging model to recover the 3D structure of an object from an acquired 2D image sequence is one of the classic problems in the field of computer vision. 3D reconstruction refers to the establishment of mathematical models of 3D objects suitable for computer representation and processing. It is the basis for processing, manipulating, and analyzing the properties of 3D objects in a computer environment, and it is also a key technology for establishing virtual reality that expresses the objective world in a computer. How to make computers perceive 3D environmental information has always been one of the goals in the field of computer vision. The development of computer vision and deep learning has provided significant enhancements in fields such as autonomous driving, biometrics, video recognition, and drones. However, for these areas to improve further, 3D reconstruction may be a good breakthrough. Existing 3D reconstruction technology generally uses structure-from-motion (SfM) [1] approaches and multiview stereo (MVS) [2] pipelines (e.g., PMVS [3] or SURE [4]). The former obtains a sparse point cloud model of the scene and the camera pose information, which are then passed to MVS to develop a dense 3D point cloud model. However, because the feature point dataset is very large, the MVS algorithm is slow and often requires a large amount of time and computing memory. In addition, viewing the result in a point cloud viewer becomes extremely difficult. Moreover, image-based 3D reconstruction is affected by factors such as lighting and occlusion when extracting feature points, and it is sensitive to the accuracy of feature point matching and of camera calibration when calculating the camera projection matrices and solving for space points.
Correctly extracting and matching feature points and accurately solving the 3D geometric relationships have always been difficult problems in the field of computer vision. Therefore, more complex geometric primitives can be selected as the data representation, such as planes (e.g., [5][6][7]) or lines (e.g., [8][9][10]). By analyzing the pinhole camera model, epipolar geometry, and various line segment detection algorithms, it is found that 3D reconstruction based on line matching is feasible. In addition, the surrounding artificial buildings have prominent line segment geometric features; if the relevant 3D information is extracted and matched, the 3D reconstruction efficiency can be enhanced. The distinctive geometric features of line segments can thus be used to extract and match the related 3D information. These line segments can be obtained by any line segment detector. The two most commonly used line detection algorithms are LSD [11] and EDL [12]; both provide accurate detection with few false positives in a very efficient way. Figure 1 shows an example image with line segments obtained using the LSD algorithm.
The literature shows that 3D reconstruction technology has evolved from point-based structure-from-motion algorithms to line-based multiview stereo vision algorithms, but each algorithm has its own advantages and disadvantages. Currently, how to obtain a high-precision 3D scene model is also the focus of 3D reconstruction research. Since the structure of a scene is usually complicated and of large scale, obtaining a high-precision 3D model is still a problem that deserves attention and requires a significant amount of resources, energy, and technical research.

Related Work
The point-based structure-from-motion algorithm relies heavily on distinctive textures in the scene and performs poorly in monotonous environments. Although the SfM algorithm strives to create sufficient feature matches in order to successfully calculate the correct camera poses, the generated 3D models are usually very sparse. Since the linear characteristics of the artificial building environment are very obvious and the line segment is the most common geometric feature in such environments, completing feature extraction and matching using straight-line segment features is a good choice.
Bay et al. [13] used line segments from two uncalibrated images to determine the relative camera poses and to compute a piecewise planar 3D model. However, this method is not suitable for processing more than two images, is not robust when dealing with unstable lighting conditions, and is unable to handle some outdoor scenes.
Further, Schindler et al. [14] incorporated the Manhattan-world assumption into the reconstruction procedure to decrease the computational complexity and to reconstruct buildings from two views.
In 2010, Jain et al. [15] proposed a method of reconstructing lines from multiple stereo images. The method does not require correspondences of line segments across different images; it independently reconstructs the line segments using connectivity constraints and then calculates the final 3D model by merging. Although this method achieves good visualization, it is not suitable for large-scale datasets.
In 2014, Micusik and Wildenauer [16] proposed a SLAM-like system with line matching across narrow baselines and showed impressive results, especially for indoor scenes. However, this method only estimates the camera pose; 3D reconstruction from the line segments themselves remains extremely difficult.
Hofer et al. [17] proposed a public line-based 3D reconstruction tool called Line3D++. The method first establishes a large set of potential line correspondences between images through weak epipolar constraints and then uses a scoring formulation based on mutual support to separate correct matches from incorrect matches for each segment.
The final line-based 3D model is obtained by clustering the 2D segments from different views using an effective graph-clustering formulation. However, the 3D reconstruction results of the Line3D++ method lose part of the line structure, mainly because the line detection results used in the line matching step are not located at the true edges of the image and there is no consistency check of the matched line pairs.
To solve this problem, this paper first corrects the LSD line detection results used by Line3D++ and then uses the epipolar constraint principle to eliminate the mismatched lines. Finally, an accurate and complete 3D reconstruction result is obtained.

Correct Line Position
Let the image be I. Use the Canny operator [18] to perform edge detection on I and obtain the edge map E. The gradient map G is solved from E using first-order differences:

d_x(i, j) = E(i + 1, j) − E(i, j), d_y(i, j) = E(i, j + 1) − E(i, j), (1)

where (i, j) are the coordinates of the pixel, while d_x(i, j) and d_y(i, j) are the first-order partial derivatives in the x and y directions, respectively. Figure 2 shows the construction process of the extended gradient map. The gradient direction of the pixels surrounding the edge pixels is first cleared, and all of the surrounding pixels are then regarded as edge pixels. The gradient is then recalculated, as shown in Figure 2(c). This process is repeated until all pixels have been traversed, and the final result is obtained as shown in Figure 2. Two problems remain:
(1) The first problem occurs in the local area of a corner. If D(p_1, e_1) < D(p_1, e_2), the correction result is p_1 ∈ e_1 and p_2 ∈ e_2. In this case, the gradient gravitational map produces an incorrect correction; in fact, p_1 should satisfy p_1 ∈ e_2.
(2) The second problem occurs at abutting edges. If D(p_3, e_2) < D(p_3, e_3), the correction result is p_3 ∈ e_2 and p_4 ∈ e_3. Again the gradient gravitational map produces an incorrect correction; in fact, p_3 should satisfy p_3 ∈ e_3.
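As an illustration, the first-order difference computation of equation (1) can be sketched in Python. This is a minimal sketch assuming the edge map is a 2D array; `first_order_gradient` is a hypothetical helper name, not part of the paper's implementation:

```python
import numpy as np

def first_order_gradient(E):
    # First-order differences of the edge map E, as in equation (1):
    #   d_x(i, j) = E(i + 1, j) - E(i, j)
    #   d_y(i, j) = E(i, j + 1) - E(i, j)
    # Border rows/columns without a forward neighbour are left as zero.
    E = np.asarray(E, dtype=float)
    dx = np.zeros_like(E)
    dy = np.zeros_like(E)
    dx[:-1, :] = E[1:, :] - E[:-1, :]
    dy[:, :-1] = E[:, 1:] - E[:, :-1]
    mag = np.hypot(dx, dy)      # gradient magnitude
    ang = np.arctan2(dy, dx)    # gradient direction in radians
    return mag, ang
```

The magnitude and direction maps are what the gravitational-map construction below consumes.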
In order to address the aforementioned problems, this paper proposes the following straight-line correction method.
In Figure 5, the blue line is the correct edge position. Take the yellow straight-line segment l as an example, with the two endpoints P and Q. The yellow line is divided into n equal parts, and the division points are denoted E_1, E_2, . . . , E_{n−1}. The corrected positions P′ and Q′ of the endpoints P and Q and the corrected positions E_1′, E_2′, . . . , E_{n−1}′ of the division points are calculated using the gradient gravitational map.
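The subdivision and correction steps can be sketched as follows. This is a simplified Python sketch: the gradient-gravitational-map lookup is approximated here by a brute-force nearest-edge search, and the helper names are hypothetical:

```python
import numpy as np

def subdivide(p, q, n):
    # Return P, E_1, ..., E_{n-1}, Q: the endpoints plus the n - 1 points
    # that divide segment PQ into n equal parts.
    p, q = np.asarray(p, float), np.asarray(q, float)
    return [tuple(p + (q - p) * k / n) for k in range(n + 1)]

def correct_points(points, edge_pixels):
    # Snap each sampled point to the nearest edge pixel.  This stands in
    # for the gradient-gravitational-map lookup, which the paper
    # precomputes per pixel instead of searching on the fly.
    edge = np.asarray(edge_pixels, dtype=float)
    corrected = []
    for p in np.asarray(points, dtype=float):
        d = np.linalg.norm(edge - p, axis=1)  # distances to all edge pixels
        corrected.append(tuple(edge[np.argmin(d)]))
    return corrected
```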
Let K(·) denote the slope of a line segment. Calculate the slopes K(P′E_1′), . . . , K(E_{n−1}′Q′) of each small line segment after correction. Then, sort them in ascending order to obtain the new sequence K_1 < K_2 < · · · < K_n. Take the middle w consecutive slopes K_{[(n−w)/2]+1}, K_{[(n−w)/2]+2}, . . . , K_{[(n−w)/2]+w} and calculate their standard deviation as

σ = sqrt( (1/w) Σ_{i=1}^{w} (K_{[(n−w)/2]+i} − K̄)² ), (2)

where K̄ = (1/w) Σ_{i=1}^{w} K_{[(n−w)/2]+i}. Set a threshold ε; if σ < ε, then add K_{[(n−w)/2]}, . . . , K_1 one at a time to the aforementioned w consecutive slopes, calculating a new standard deviation σ′ after each addition and checking the condition σ′ < ε. If the condition holds, continue joining the next slope and repeat the above operation; otherwise, stop and remove the slope just added. Perform the same operations on K_{[(n−w)/2]+w+1}, . . . , K_n.
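The slope-consensus selection above can be sketched as follows. The population standard deviation is assumed, `consensus_slopes` is a hypothetical name, and the behaviour when the initial middle window already exceeds ε is a guess, since the paper does not specify it:

```python
import statistics

def consensus_slopes(slopes, w, eps):
    # Sort the slopes, start from the w middle values, and greedily extend
    # the window left and then right while the standard deviation stays
    # below eps; slopes outside the final window are discarded.
    ks = sorted(slopes)
    n = len(ks)
    lo = (n - w) // 2
    hi = lo + w
    if statistics.pstdev(ks[lo:hi]) >= eps:
        return []                    # no consensus among the middle slopes
    while lo > 0 and statistics.pstdev(ks[lo - 1:hi]) < eps:
        lo -= 1                      # grow to the left
    while hi < n and statistics.pstdev(ks[lo:hi + 1]) < eps:
        hi += 1                      # grow to the right
    return ks[lo:hi]
```

An outlier slope (e.g., from a segment dragged to the wrong edge) raises the deviation as soon as it joins the window and is therefore excluded.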
At this time, the small line segments whose slopes pass the test are saved; let their total number be s. The total number of endpoints is then 2s, and connectivity is determined from the number of occurrences of each endpoint. Let N(x) denote the number of occurrences of endpoint x. If exactly two endpoints satisfy N(x) = 1 and s − 1 endpoints satisfy N(x) = 2, then all of the saved small line segments are connected, and the two points satisfying N(x) = 1 are taken as the new endpoint pair. The segment is then extended to maintain the same length as the original straight-line segment. The extension method is as follows. Assume that the segment l in Figure 6 has been shortened at its two ends p and q by d_1 and d_2, respectively, and that p_11 and q_11 are used as the starting nodes. The extension vectors are then calculated as

V_p11 = d_1 · Vt(q_11, p_11), V_q11 = d_2 · Vt(p_11, q_11), (3)

where Vt is a vector calculation function (the unit vector from its first argument to its second), and V_p11 and V_q11 determine the new pair of endpoints of the corrected straight-line segment.

The line matching results are then refined. The epipolar lines across three adjacent images are calculated from the point feature matching results. The matching results are combined to determine the final verified local feature area, and random sampling is used to verify the feature similarity in the small neighborhoods; the incorrectly matched line features are thus eliminated. If a matching line exists in only two adjacent images, the above method is performed in those two images only. Figure 7 shows the process of determining small neighborhoods by combining epipolar lines, taking three adjacent images I_1, I_2, and I_3 as an example. The blue lines are the epipolar lines, while the yellow and black lines represent one of the epipolar lines and the corresponding matching line, respectively.
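The connectivity test on endpoint occurrence counts can be sketched directly. The helper name is hypothetical, and endpoints are assumed to compare exactly equal, whereas a real implementation would match them within a pixel tolerance:

```python
from collections import Counter

def connected_chain_endpoints(segments):
    # With s retained segments, they form one connected chain exactly when
    # two endpoints occur once (N(x) = 1) and s - 1 endpoints occur twice
    # (N(x) = 2).  Returns the new endpoint pair, or None if not connected.
    counts = Counter(pt for seg in segments for pt in seg)
    ones = [pt for pt, c in counts.items() if c == 1]
    twos = [pt for pt, c in counts.items() if c == 2]
    if len(ones) == 2 and len(twos) == len(segments) - 1:
        return tuple(ones)
    return None
```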
In this paper, the epipolar lines and the corresponding matching lines are used as references to obtain the small neighborhoods A, B, C, A′, B′, C′, A″, B″, C″ around the lines, that is, the areas surrounded by the green lines. The specific solution method is as follows.
Firstly, the division points a′, b′, c′, . . . are obtained by equally dividing line segment L′. Then, according to the epipolar constraint, the corresponding epipolar lines of a′, b′, c′, . . . in the adjacent images of Figure 7 are obtained, respectively. In order to speed up the calculation, the regional similarity can be calculated by random sampling.
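For reference, the epipolar line of a point follows directly from the fundamental matrix. This is a minimal sketch of standard epipolar geometry, not code from the paper:

```python
import numpy as np

def epipolar_line(F, x):
    # Epipolar line l' = F @ x in the second image for pixel x = (u, v) in
    # the first image, in homogeneous line coordinates (a, b, c) with
    # a*u' + b*v' + c = 0.  Normalized so that (a, b) is a unit vector.
    xh = np.array([x[0], x[1], 1.0])
    l = np.asarray(F, dtype=float) @ xh
    return l / np.hypot(l[0], l[1])
```

Sampling each division point a′, b′, c′, . . . through this mapping yields the epipolar lines whose intersections bound the small neighborhoods.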
Secondly, the size of each small neighborhood is determined. In the experiment, let the radius threshold be R, and save the pixels closer than the radius threshold in a new image I_m. Finally, the gradient direction determines whether a pixel belonging to I_m points to the left or the right neighborhood. To keep the orientation consistent, the area pointed to by the line gradient is taken as the right neighborhood and the other side as the left neighborhood. The similarity of the line neighborhoods is determined by comparing the pixel colors within the regions. Let the matching areas corresponding to L, L′, and L″ be NE, NE′, and NE″, containing m, m′, and m″ pixels, respectively. The neighborhood similarities NE′-NE, NE′-NE″, and NE-NE″ are calculated. The neighborhood similarity between NE′ and NE is

S(NE′, NE) = (1/K) Σ_{i=1}^{K} δ(|c′(x_i) − c(x_i)| < T), (4)

where K is the number of randomly sampled pixel pairs, c(x_i) and c′(x_i) are the colors of the i-th sampled pair, T is the similarity threshold, and δ(·) equals 1 when its condition holds and 0 otherwise. The neighborhood similarities NE′-NE″ and NE-NE″ are calculated in the same manner as equation (4). If the randomly sampled corresponding regions in the three images all have high similarity, the matching straight line is judged to be correct. In this way, the straight lines are refined, and every triple of adjacent images in the dataset is processed in order to obtain the final line matching results.
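The random-sampling similarity check can be sketched as follows. This is a simplified sketch over grayscale values with a fixed seed; the function name and argument layout are assumptions:

```python
import random

def neighborhood_similarity(colors_a, colors_b, T, K, seed=0):
    # Sample K corresponding pixel pairs at random and return the fraction
    # whose color difference is below the threshold T, in the spirit of
    # equation (4).  A match is accepted when this fraction is high enough
    # (i.e., exceeds the proportional threshold) in all three image pairs.
    rng = random.Random(seed)
    n = min(len(colors_a), len(colors_b))
    idx = [rng.randrange(n) for _ in range(K)]
    hits = sum(1 for i in idx if abs(colors_a[i] - colors_b[i]) < T)
    return hits / K
```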
The construction of the gradient gravitational map is shown in Algorithm 1, and the method of refining the line matching results is shown in Algorithm 2.

Experiments
For the experiments, a 3.2 GHz CPU, 16 GB RAM, and an Nvidia GeForce GTX 1060 6 GB GPU were used. The proposed algorithm was implemented in C++ and (optionally) also in CUDA.
In the experiment, the local neighborhood radius R was set to 5 pixels, the number of samples K was 30% of all small neighborhoods, the similarity threshold T was set to 10, and the proportional threshold P was set to 90%. The effectiveness of the proposed method is illustrated by three comparative experiments. The first shows the results of line matching and purification. The second compares the sparse 3D point cloud model with the results of the proposed method. The third compares the results of Line3D++ with the results of the proposed method. Figures 8-13 show the experimental results. Figures 8 and 9 show the results of line matching and purification: Figures 8(a) and 9(a) show the direct matching results with LSD, Figures 8(b) and 9(b) show the results after straight-line correction, and Figures 8(c) and 9(c) show the results after purification of the matching.
In Figure 8, some of the lines detected by the LSD are offset from the true edge positions toward the middle of the side plane; this situation can easily lead to incorrect matching results. The proposed method solves this problem.

Algorithm 1 (steps of constructing the gradient gravitational map): fill GM with E and the EGM according to the above method; then calculate the gravitational value at each location: if the gradient direction is 45°, draw a circle centered on the current position and fill GM(1, i + kx, j + ky) and GM(2, i + kx, j + ky) according to the edge pixels that the circle intersects first; otherwise, fill according to the EGM; finally, return GM.

In Figure 9, Figure 9(b) eliminates lines 23, 5, 4, 3, 1, 8, and 26 from the matching result compared with Figure 9(a). Compared with Figure 9(b), Figure 9(c) removes line 1. Among all of the removed matched line pairs, the pairs with obvious errors are lines 2 and 26; since they are not located at the true edges of the image, they cause matching errors. When the corrected result is used as the input for matching, a matching result with higher accuracy is obtained. In addition, lines 1 and 5 in Figure 9(a) are clearly located at different edges; after processing by the proposed method, this kind of error is effectively reduced.

Figures 10 and 11 compare the sparse and the line-based 3D models: they show, respectively, the sparse 3D point cloud model of the scene obtained by processing the image set with the SfM algorithm and the line-based 3D reconstruction results.
It can be seen from the comparison that the sparse 3D point cloud model is able to represent the characteristics of the building, but the overall structure is not very obvious. On the other hand, although the 3D line segment model is very vague at some curved features, the lines of the building are very obvious. The 3D line segment model represents the geometric structure of the building more prominently, especially for buildings with many straight lines and few curves. In addition, the 3D point cloud model has poor reconstruction accuracy in the absence of texture, and the obtained models often contain holes.
The 3D line segment model can provide more structural information and better reflect the geometric topology of the scene than the 3D point cloud model. Thus, the 3D line segment model provides highly meaningful semantic 3D information for reconstruction. Figures 12 and 13 compare the Line3D++ results with the results of the proposed method. It can be seen from the figures that the 3D reconstruction results of Line3D++ lack many lines. Compared with Line3D++, the proposed method recovers more lines and more local structures, which significantly improves the integrity of the object. The experimental results can be analyzed from different angles and for different structures. For example, Figure 12(c) shows the position of the door, where the reconstructed line segments are very sparse. The line segments reconstructed in Figure 12(d) are relatively dense, and many important lines are restored, making the structure of the door clearly visible. Figure 12(e) shows the position of the brick wall, but most of the reconstructed lines are vertical; there are only a few horizontal lines, and it is impossible to see what structure has been reconstructed. In contrast, Figure 12(f) restores a relatively large number of horizontal lines, making the line segments and the outline of the bricks clearer. Figure 13(c) shows a partial enlargement of the stepped portion; it can be seen that only a few lines have been reconstructed, while Figure 13(d) shows that the steps are reconstructed properly owing to the high reconstruction integrity of the proposed method. Figure 13(e) shows a partial area of the wall; obviously, Figure 13(f) has more lines and better results. Table 1 shows the number of feature lines of Line3D++ and of the method proposed in this paper for the two datasets. For the Castle dataset, the numbers of feature lines in Line3D++ and the proposed method are 1,590 and 1,676, respectively.
For the Herz-Jesu dataset, the numbers of feature lines in Line3D++ and the proposed method are 1,704 and 2,394, respectively. The number of feature lines increased significantly on both datasets. Table 2 shows the RMSE of Line3D++ and of the proposed method for the two datasets; as can be seen, the proposed method has slightly higher accuracy. Reconstruction of the 3D line segment model was conducted for two classical datasets. By comparing the experimental results and the data in Tables 1 and 2 before and after applying the proposed method, it can be seen that the proposed method effectively remedies the defects of insufficient accuracy and poor visual effect, for example, stray lines and areas in which the characteristics of the buildings cannot be restored. Moreover, the proposed method improves the matching accuracy, produces a more detailed model outline, and provides high 3D reconstruction efficiency.

Conclusion
This paper presents a linear correction and matching method for 3D reconstruction of target line structures and resolves the mismatching problem in the line matching step of the Line3D++ algorithm. The gradient map is extended to construct the gradient gravitational map in order to correct the positions of the straight-line segments detected by the straight-line extraction method. The epipolar constraint is used to eliminate mismatched straight lines in order to improve the quality of the 3D reconstruction. The experimental results demonstrate that the 3D reconstruction results obtained by the proposed method are more accurate and complete than those of Line3D++.

Data Availability
The data used to support the findings of this study are included within the article.

Conflicts of Interest
The authors declare that they have no conflicts of interest.