3D Reconstruction of Tree-Crown Based on UAV Aerial Images

An algorithm for the 3D reconstruction of the tree-crown is presented using UAV aerial images from a mountainous area in China. Since the aerial images contain little tree-crown texture and contour information, a feature-area extraction method based on watershed segmentation is proposed, and the local area correlation coefficient is calculated to match the feature areas, so as to fully extract the characteristics that reflect the structure of the tree-crown. Then, the depth of the feature points is calculated using stereo vision theory. Finally, L-system theory is applied to construct the 3D model of the tree. Experiments are conducted on tree-crown images cut manually from UAV aerial images. The results show that the proposed method can fully extract and match the feature points of the tree-crown and reconstruct its 3D model correctly.


Introduction
In recent years, Unmanned Aerial Vehicle (UAV) imaging has become an irreplaceable mapping tool. We can obtain multiple images from multiple angles and positions during a single UAV flight [1]. By processing UAV sequential images with the theories and technologies of 3D reconstruction, we can obtain 3D models of targets on the ground. However, the height, color, texture, and outline information of the tree-crown in high-altitude images is fuzzy, and the jitter, offset, and rotation of the aircraft cause motion and rotation of the camera. Furthermore, under the influence of many factors, the structure of the tree-crown has much uncertainty, and it is difficult to establish empirical knowledge [2]. For all these reasons, reconstructing the 3D model of the tree-crown accurately from UAV aerial sequential images has important theoretical and practical significance.
In this paper, a tree-crown 3D reconstruction algorithm based on UAV aerial images is proposed. Our feature point extraction and matching method fully considers the characteristics of the aerial images and of the target to be reconstructed, which can provide a reference for future research. The modeling approach takes the structural characteristics of the canopy into account and makes full use of the limited available information, so that the difficult problem of canopy modeling is handled with better results.
A brief review of relevant research is presented in the second part. Then, building on previous work, the details of the algorithm are described in the third part, while in the fourth part experiments are demonstrated to validate the algorithm, followed by a conclusion and discussion.

Related Work
At present, most research on 3D reconstruction focuses on the stereo vision principle: two or more images of the same target taken from different angles are processed to reconstruct the 3D shape of the target. Sun and Bergerson [3], Liu et al. [4], and Hapca et al. [5] proposed reconstruction methods and models of trees along this line. When this idea is applied to the 3D reconstruction of aerial images, the jitter and translation of the UAV can be taken into account to improve the model and the results. Compared with their research, which concerns the backbone and structure information of the tree, our research considers the detailed information of the tree-crown contour and the internal texture.
Tan et al. [6] proposed a 3D tree reconstruction method that predicts occluded branches from the visible ones, and Tang et al. [7] proposed a 3D tree reconstruction method based on layered areas, in order to calculate the biological parameters of the tree. Our method, in contrast, is concerned with the branches as well as the characteristics of the outline. Cheng et al. [8] provided a good idea of determining the entire tree model by first determining the tree skeleton. There are many other studies on 3D tree reconstruction, such as [9-11], and their work gives important constructive advice for ours. Our research is more concerned with the unique features of the target tree-crown.
Although most research is concerned with the backbone and structure information of the tree, we consider the detailed information of the tree-crown contour and internal texture. Extracting and matching the feature points of the target is very important for 3D reconstruction. In our model and method, the process is designed to fully extract the points that reflect the structural characteristics of the target and to match them correctly, since this directly affects the accuracy of the final 3D reconstruction results. Commonly used methods of corresponding-point matching can be classified as feature-based methods and region-correlation-coefficient-based methods [1].
SIFT and SURF are typical feature-based matching methods; for example, Hou et al. [12], Ciobanu and Côrte-Real [13], and Bellavia et al. [14] all used the SIFT algorithm to extract and match feature points in 3D problems and achieved good robustness and stability. Such algorithms are resilient, stable, and robust to image rotation, zooming, and other changes, but they can only extract a limited number of points from images with limited texture information.
Many scholars extract and match feature points based on the region correlation coefficient; for instance, Wei et al. [15] put forward a method based on the region correlation coefficient of the gray image to match the feature points. Region-based correlation methods perform well in accuracy, but the computation time is long. Further improvements have been made; for example, Yang [16], Mori and Kashino [17], and Zhang et al. [18] proposed improvements on template matching based on the normalized correlation coefficient (NCC). These methods do not need complex preprocessing of the images; they only use the gray-level statistics of the image itself [19]. Based on the analysis above, we select appropriate feature extraction methods and efficient scanning methods to reduce the time consumption.
In addition, considering the complex structure of the tree and the limited canopy structure characteristics in aerial images, 3D modeling approaches are commonly based on grids. Li [20] and Shlyakhter et al. [21] used the Lindenmayer system (L-system) to automatically generate tree models; in our proposed method, the L-system is applied to build complex models of trees.
On the basis of current research, a tree-crown 3D reconstruction algorithm is proposed. Firstly, feature points are extracted based on the watershed, and features are matched by calculating the local area correlation coefficient (LACC) in the RGB color space; secondly, the depth information is obtained based on the principle of binocular stereo vision; finally, the L-system is applied to the 3D modeling of the tree-crown.

3D Reconstruction Algorithm
The flow of the proposed algorithm is presented in Figure 1; it is divided into six parts: image acquisition, preprocessing, feature extraction and matching, camera parameter estimation, depth calculation, and 3D modeling.
To exclude the impact of other image parts on the 3D reconstruction, we manually crop the crown area from two successive aerial images at the same size of 128 × 128, and the GrabCut and BgCut [22] algorithms are applied to remove the background from the tree-crown, leaving only the crown area.

Feature Extraction and Matching.
Feature extraction and matching is a crucial step in 3D reconstruction, and the accuracy of the matched feature points directly affects the final results. Due to the complexity of UAV aerial images, matching becomes more difficult as the number of extracted points increases.
Based on the analysis of the tree-crown images cut from the aerial images, we found that UAV aerial images contain relatively scarce information about a single tree, and the texture of a tree-crown is self-similar or nearly absent, while the same area in the two images is highly correlated. In response to these findings, a feature point extraction and matching method is designed with four parts: feature extraction based on the watershed; region matching based on the LACC; elimination of false matches; and the mean geometric registration error.

Feature Extraction Based on Watershed.
Since the two images were captured within a short interval, the illumination changes between them are small. In the tree-crown images, the area at the top of a trunk is relatively bright while the bottom area is relatively dark. We therefore use the watershed segmentation method to separate the light and dark areas, in order to further extract the feature points that reflect the trunk structure of the tree for 3D reconstruction.
The watershed segmentation algorithm usually takes a gradient image as input and can obtain closed areas quickly. However, it often suffers from oversegmentation [19]. In this paper, this oversegmentation is exactly suited to our requirement of fully separating the light and dark areas of the tree-crown.
The classic watershed segmentation algorithm was proposed by Vincent and Soille [23] and proceeds in two steps: a sorting process and a submerging process. First, the gray-level height of each pixel is sorted from low to high. Then the submerging proceeds from low to high: for each local minimum in the influence domain at height h, a first-in-first-out (FIFO) queue is used to perform the judgment and complete the labeling.
The watershed algorithm transforms the input image into a basin image, and the boundaries of the basins are the watersheds. The gradient image is used as the input, and the gradient value is calculated as in (1), where f(x, y) is the original image and grad(·) is the gradient function:

g(x, y) = grad(f(x, y)).   (1)

The watershed segmentation method used in this paper proceeds as follows:

(i) Compute the gradient of the image with the Sobel operator; the horizontal and vertical 3 × 3 Sobel templates of (2) are

S_x = [-1 0 1; -2 0 2; -1 0 1],  S_y = S_x^T.   (2)

(ii) Smooth the resulting gradient image.

(iii) Segment the smoothed image using the watershed segmentation algorithm.
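The Sobel gradient computation of (1)-(2) can be sketched as follows. This is a minimal reference implementation, not the authors' code; it uses a plain double loop over interior pixels for clarity rather than speed:

```python
import numpy as np

# Horizontal and vertical 3x3 Sobel templates, as in eq. (2).
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def gradient_image(f):
    """Gradient magnitude of a grayscale image f, eq. (1):
    g(x, y) approximated from the two Sobel responses."""
    h, w = f.shape
    g = np.zeros((h, w), dtype=float)
    for x in range(1, h - 1):
        for y in range(1, w - 1):
            patch = f[x - 1:x + 2, y - 1:y + 2]
            gx = np.sum(SOBEL_X * patch)  # horizontal response
            gy = np.sum(SOBEL_Y * patch)  # vertical response
            g[x, y] = np.hypot(gx, gy)    # gradient magnitude
    return g
```

The resulting gradient image would then be smoothed and passed to a watershed implementation (e.g., a marker-based one) in step (iii).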

Region Matching Based on LACC.
As mentioned above, region matching can be achieved based on the LACC values, which are calculated in the RGB color space; see Figure 2.
The LACC in the RGB color space is computed from ρ_R, ρ_G, and ρ_B, the correlation coefficients of the three components R, G, and B, respectively. Taking ρ_R as an example, it is calculated as in (6):

ρ_R(u, v) = Σ_{i=-m..m} Σ_{j=-n..n} [R_1(u+i, v+j) - R̄_1(u, v)][R_2(u+i, v+j) - R̄_2(u, v)] / (σ_1(u, v) σ_2(u, v)),   (6)

where R_1(u+i, v+j) and R_2(u+i, v+j) are the values of R at the corresponding positions in the two images, and m and n are the half width and half length of the sliding window; the size of the rectangle circumscribing the watershed area is taken as the size of the sliding window. R̄_k(u, v) is the mean value of R over all pixels in the sliding window of image k, and σ_k(u, v) is the corresponding standard deviation within the window, as shown in (7). The values of ρ_R, ρ_G, and ρ_B range over [-1, 1]; if the current areas in the sliding windows of the two images are matched, the correlation coefficient is close to 1.
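A minimal sketch of the per-channel LACC of (6) is given below. The paper does not state how the three channel coefficients are combined into a single score, so averaging them in `lacc_rgb` is our assumption, not the authors' stated rule:

```python
import numpy as np

def lacc(win1, win2):
    """Local area correlation coefficient (eq. (6)) between two
    equally sized windows of one color channel."""
    d1 = win1 - win1.mean()  # deviations from the window mean
    d2 = win2 - win2.mean()
    denom = np.sqrt((d1 ** 2).sum()) * np.sqrt((d2 ** 2).sum())
    if denom == 0:
        return 0.0  # a constant window carries no texture to correlate
    return float((d1 * d2).sum() / denom)

def lacc_rgb(win1_rgb, win2_rgb):
    """Combine rho_R, rho_G, rho_B by averaging (our assumption)."""
    return float(np.mean([lacc(win1_rgb[..., c], win2_rgb[..., c])
                          for c in range(3)]))
```

Identical windows give a coefficient of 1, and inverted windows give -1, matching the stated range of the coefficients.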

Figure 2: Region matching with a sliding window: the scan area and the watershed region.
The ultimate goal of this procedure is to obtain the matched feature points of the two images. In this paper, we take the centroids of the matched areas as the matching points; their coordinates are calculated as the mean coordinates of the pixels in each area. In order to reduce the time consumption, the disparity constraint between the two images is used to restrict the search area: during matching, the sliding window is moved only within this area rather than over the whole image.
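The centroid of a watershed region, used as its feature point, can be sketched as follows (assuming the segmentation is given as an integer label image):

```python
import numpy as np

def region_centroid(labels, region_id):
    """Centroid (row, col) of one watershed region in a label image;
    the centroid serves as the region's matching feature point."""
    rows, cols = np.nonzero(labels == region_id)
    return float(rows.mean()), float(cols.mean())
```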

Elimination of False Matches.
In the matching procedure, only local information of the area is considered, so we obtain the correct matching areas of the two images along with some false matches. To improve the results, the false matches need to be eliminated.
In this paper, we use the ratio of the second-largest correlation coefficient to the largest one to eliminate false matches. According to [24], a falsely matched area has many other candidate areas with similar correlation coefficient values, so its ratio is larger than that of a correctly matched area. We therefore set a threshold as the criterion and eliminate any match whose ratio exceeds the threshold.
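The ratio test can be sketched as below. Reading the ratio as second-best over best is our interpretation; it is consistent with the thresholds below 1 (0.90-0.93) reported in the experiments, where a lower threshold removes more matches:

```python
def ratio_test(scores, threshold=0.91):
    """Keep a candidate match only if its second-best correlation is
    clearly below its best one, i.e. second/best <= threshold."""
    s = sorted(scores, reverse=True)
    if len(s) < 2 or s[0] <= 0:
        return True  # nothing comparable to the best score
    return s[1] / s[0] <= threshold
```

A match whose two best correlation values are nearly equal is considered ambiguous and is discarded.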

Mean Geometric Registration Error. Points in the two images are related by the mapping

x_2 = H x_1,

where H is the homography matrix between the two images and x_1 and x_2 are corresponding points in the two images. We calculate the mean geometric registration error E according to the homography principle, as shown in Figure 3:

E = (1/N) Σ_{i=1..N} dist(x_2^(i), H x_1^(i)),

where dist(·) is the distance between the matched point x_2^(i) and the projected point x_2'^(i) = H x_1^(i). Thus, if the points are matched correctly, the distance between x_2 and x_2' is small, resulting in a small E; in other words, E decreases as the accuracy of the matching increases.
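The mean geometric registration error can be sketched as follows, assuming H is given as a 3 × 3 matrix and the points as pixel coordinates; the projection is done in homogeneous coordinates:

```python
import numpy as np

def mean_registration_error(H, pts1, pts2):
    """Mean distance between H-mapped points of image 1 and their
    matched points in image 2 (the error E; smaller is better)."""
    errs = []
    for (x1, y1), (x2, y2) in zip(pts1, pts2):
        px, py, pw = H @ np.array([x1, y1, 1.0])  # project via homography
        errs.append(np.hypot(px / pw - x2, py / pw - y2))
    return float(np.mean(errs))
```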

Depth Calculation.
In this paper, the pinhole camera model is applied. Assume a 3D point X = (X, Y, Z)^T in this projection model and its corresponding image point x = (u, v)^T; the two points satisfy

s x̃ = K [R | t] X̃,

where x̃ and X̃ are the homogeneous coordinates of x and X; s is the scale factor between the two sides of the homogeneous vector equation, also called the projective depth; K is the intrinsic parameter matrix; R is the rotation matrix; and t is the translation vector. We write the projection equations of the two tree-crown images as in (12) and solve them to obtain the world coordinates of the point as in (13). The camera model ignores the small rotation of the camera, so we take the rotation matrix to be the identity, R = I. We assume that the translation vectors are t_1 = (0, 0, 0)^T for image 1 and t_2 = (t_x, t_y, 0)^T for image 2. There are various methods to estimate the camera parameters; with these known parameters, we can obtain the 3D coordinates of the feature points.
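Under the simplified model above (R = I, t_1 = 0, t_2 = (t_x, t_y, 0)^T), the depth follows from the disparity of the normalized image coordinates. The sketch below is our illustration of this triangulation, not the authors' exact solution of (12)-(13); choosing the axis with the larger disparity is our choice for numerical stability:

```python
import numpy as np

def triangulate(K, x1, x2, t2):
    """World coordinates of a matched point under the simplified
    model: R = I, t1 = (0, 0, 0), t2 = (tx, ty, 0).
    x1 and x2 are pixel coordinates (u, v) in images 1 and 2."""
    Kinv = np.linalg.inv(K)
    m1 = Kinv @ np.array([x1[0], x1[1], 1.0])  # normalized ray, view 1
    m2 = Kinv @ np.array([x2[0], x2[1], 1.0])  # normalized ray, view 2
    tx, ty, _ = t2
    # From X + tx = Z * m2x and X = Z * m1x it follows that
    # Z = tx / (m2x - m1x); use the axis with the larger disparity.
    if abs(m2[0] - m1[0]) >= abs(m2[1] - m1[1]):
        Z = tx / (m2[0] - m1[0])
    else:
        Z = ty / (m2[1] - m1[1])
    return np.array([Z * m1[0], Z * m1[1], Z])
```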

3D Modeling.
The L-system was proposed by the biologist Lindenmayer [25]. It was later developed into an effective computer-graphics tool for simulating natural scenery; it is a rewriting system whose symbols and words carry control parameters. The right-handed Cartesian coordinate system of the L-system is defined in Figure 4 and is composed of three mutually orthogonal direction vectors [17]. Commonly used L-system commands include 3D positioning commands, special positioning commands, movement commands, structure commands, and increment/decrement commands.
After the 3D points are obtained through the method described above, we apply a specific method to derive the trunk information of the tree-crown from the points, and then we use L-system commands to construct the trunk structure and further the whole tree with leaves. According to Figure 5, the method used to obtain the trunk structure is as follows: (i) Find the highest point and the lowest point along the height coordinate and take their difference as the tree height.
(ii) Select the highest point on the vertical axis as the central trunk of the tree, and set its depth to 1. Add the trunk to the TrunkList collection, and add the highest point to the trunk point set IncludePoints.
(iii) Scan the feature points not yet in IncludePoints to find the one with the minimum distance to the trunks in TrunkList, and add that point to IncludePoints.
(iv) Create a new trunk from the new point to its nearest trunk at an angle of 30 degrees. The depth of the new trunk is one more than that of its parent (the maximum depth of the tree is restricted to less than 4); add the new trunk to TrunkList.
(v) Return to step (iii) until all the points are included in IncludePoints.
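Steps (i)-(v) can be sketched as below. This is a simplified illustration: the distance from a point to a trunk is approximated by the distance to the trunk's end point, and the 30-degree branch angle of step (iv) is not enforced geometrically:

```python
import math

def build_trunks(points, max_depth=4):
    """Sketch of the trunk-construction steps (i)-(v).
    `points` is a list of (x, y, z) feature points; each trunk is a
    (start, end, depth) segment."""
    pts = sorted(points, key=lambda p: p[2])  # sort by height z
    lowest, highest = pts[0], pts[-1]
    height = highest[2] - lowest[2]           # step (i): tree height
    # Step (ii): vertical central trunk through the highest point, depth 1.
    trunks = [((highest[0], highest[1], lowest[2]), highest, 1)]
    included = {highest}                      # the IncludePoints set
    remaining = [p for p in points if p != highest]
    while remaining:                          # steps (iii)-(v)
        # Nearest (point, trunk) pair, measured to the trunk end point.
        point, trunk = min(((p, t) for p in remaining for t in trunks),
                           key=lambda pt: math.dist(pt[0], pt[1][1]))
        depth = min(trunk[2] + 1, max_depth)  # step (iv): cap the depth
        trunks.append((trunk[1], point, depth))
        included.add(point)
        remaining.remove(point)
    return height, trunks
```

Every feature point ends up attached to the growing trunk skeleton, which the L-system commands then render as branch segments.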

Experiment
Based on the above algorithm, two successive images obtained during a UAV flight are manually cropped to tree-crown images of the same size, 128 × 128, as shown in Figure 6(a). In order to avoid the impact of noise outside the canopy region, we use the BgCut [22] algorithm to remove the image background, as shown in Figure 6(b).
Next, the light and dark areas of the tree-crown in image 1 are separated with the watershed segmentation algorithm. After obtaining them, the number of pixels in each area is recorded, and the centroids of the areas are taken as the feature points of image 1. As Figure 7 shows, we can extract bright and dark points over the tree-crown that reflect its trunk structure. Image 1 is separated into 47 areas with 47 feature points. Then the local area correlation coefficient is calculated in the RGB color space to find the matched areas in image 2; the rough matching result is shown in Figure 8.

Mathematical Problems in Engineering
The points of image 2 are roughly matched with the points of image 1, inevitably including some false matches. So we apply the different thresholds described in Section 3 to find the proper results for reconstructing the tree model.
The mean geometric registration errors between the matched points are calculated, and the results are shown in Table 1. The table shows that, compared with the method proposed in [24], our method picks more feature points, enough to construct the tree model, and matches them more accurately. With an appropriate threshold value, a better reconstruction result can be achieved.
The trunk structure information of the tree-crown is calculated according to the method described in the previous section, and the feature point sets and the corresponding trunk structures of the tree under different threshold values are shown in Figure 9.
As seen from the results in Figure 9, if the threshold is too large, some false matching points remain and the resulting error affects the 3D modeling, while if the threshold is too small, some correct matching points are removed, which results in a loss of information. We therefore use a threshold of 0.91, and the result is shown in Figure 10. Finally, leaves are added to the skeleton model to complete the 3D model of the target tree.

Conclusion
UAV imaging has great practical significance in forestry and land planning. However, two-dimensional images lack the necessary depth information, which limits UAV aerial applications. In this paper, 3D reconstruction vision technology is used to obtain depth information from 2D images of the object, with great advantages, and 3D reconstruction based on UAV aerial images can be used in many practical applications. A feature extraction and matching method is proposed based on the watershed segmentation algorithm, and the local area correlation coefficient is introduced in the RGB color space, which can fully extract the feature points of the images that reflect the structure of the target tree-crown. The reconstruction of the 3D model of the tree-crown is designed based on the principle of stereoscopic vision from the limited information of the aerial images. In the proposed method, we simplify the camera imaging model of the UAV. The method applies to aerial images taken within a small interval, and the 3D reconstruction method is applicable to tree-crowns.

Figure 3: The principle of the mean geometric registration error.

Figure 4: The coordinate system of the L-system.

Figure 5: The construction process of the trunk structure.

Figure 7: The result of the watershed on tree image 1 and the centers of the watershed areas.

Figure 8: The matched points in the two images.

Figure 9: The results with different thresholds; (a) with threshold 0.93, 28 feature points are selected, and the result is poor because of the remaining disturbance; (b) with threshold 0.92, 27 feature points are selected, with a result similar to 0.93; (c) with threshold 0.91, 23 feature points are selected, and the result shows an effective improvement; (d) with threshold 0.90, only one feature point is lost relative to the result at 0.91.
Figure 10: The reconstruction result with the threshold 0.91.

Table 1: Mean geometric registration error.