Modelling Forest Aboveground Biomass Based on GF-3 Dual-Polarized and WorldView-3 Data: A Case Study in Datong National Wetland Park, China

School of Spatial Information and Geomatics, Anhui University of Science and Technology, Huainan 232001, China
Coal Industry Engineering Research Center of Collaborative Monitoring of Mining Area's Environment and Disasters, Huainan 232001, China
Postdoctoral Station of Geological Resources and Geological Engineering, Anhui University of Science and Technology, Huainan 232001, China
Postdoctoral Working Station, Anhui Huayin Mechanical and Electrical Co., Ltd., Huainan 232001, China


Introduction
Wetlands are important ecosystems that exist between terrestrial and aquatic systems [1,2]. They provide many ecosystem services, such as regulation of the hydrological cycle, maintenance of water quality, conservation of biological diversity, and other natural and socially beneficial services. As an important part of urban ecosystems, urban wetlands can improve the urban climate, enhance environmental quality, increase biodiversity, and conserve water [3][4][5]. However, they are strongly disturbed by human activities, such as excessive urban sprawl and air and water pollution [6], particularly in developing countries, where basic human needs have not been met. Aboveground biomass (AGB) is a parameter that can be used to evaluate the patterns, processes, and dynamics of carbon cycling in ecosystems at local, regional, and global scales. Therefore, AGB estimation at the regional and global scales plays an important role in promoting international wetland protection and other initiatives under the framework of the Convention on Biological Diversity [7]. To assess the value of wetlands in providing ecosystem services, it is essential to build a model that can estimate AGB. This can also help to determine the status of an ecosystem and inform scientific forest management.
In the development of new approaches to estimate AGB, efficiency and cost are critical factors. Line transect and distance sampling methods are commonly used; they build statistical models based on field observations of tree height and diameter at breast height (DBH) [8]. As an alternative to gathering the large number of samples required for interpolation at the regional scale, remote-sensing data have been increasingly used to estimate wetland AGB in recent decades [9,10]. Biomass estimation and long-term biomass monitoring can be achieved using optical remote sensing, which records signals reflected from the forest canopy to extract vegetation parameters that respond significantly to biomass [11,12].
Various vegetation indices derived from optical satellite images have been used to determine vegetation chlorophyll and canopy structure parameters [13][14][15][16][17]. However, limitations remain due to cloud coverage and the structural heterogeneity of the vegetation canopy [18]. Moreover, because optical remote sensing is limited in detecting vertical vegetation structure and is subject to saturation, the sensitivity of vegetation indexes to biomass change is low. Another effective way to estimate AGB is to apply light detection and ranging (LiDAR). Mounted on an aircraft, a LiDAR sensor can provide detailed vegetation structure measurements as a point cloud, yielding accurate AGB estimates without saturation at high biomass levels [19,20]. However, it cannot satisfy the needs of large-scale applications due to its low efficiency and high cost.
Synthetic aperture radar (SAR) is an active remote-sensing technology that has developed rapidly in recent years. It has a multipolarization ability that can describe the scattering mechanism of vegetation. Many studies have shown that SAR data provide unique and valuable information on wetland biophysical parameters by exploiting the particular sensitivity of radar backscatter signals to AGB [21][22][23]. The primary SAR-based AGB estimation method commonly involves a regression analysis between polarized microwave backscatter data and ground data obtained from field plots. However, a few problems in estimating AGB using the relationship between observed biomass and SAR backscattering remain because of saturation at high biomass levels and sensitivity to soil conditions [24].
To overcome the drawbacks associated with using a single data type, biomass estimation based on a combination of SAR and optical data has begun to be implemented; such a combination also makes it possible to assess how well data from the new Chinese GF-3 satellite can be applied to AGB estimation. A few studies have also explored the potential of combining SAR, optical, and/or LiDAR data for estimating and mapping wetland ecosystems [25][26][27][28][29]. The accuracy increases when the model is further improved using several datasets from different sensors simultaneously [30][31][32][33]. With the development of high-resolution optical remote-sensing research and applications, the combination of SAR and high-resolution optical images has been used for AGB estimation [31,34].
There are many wetlands in China; they covered an area of approximately 66 million hectares in 2016, which places the country at the forefront of the world [35]. Developing improved models to accurately estimate wetland biomass has become a focus of research. Thus, by combining high-spatial-resolution data from the WorldView-3 satellite with SAR data from the new Chinese GF-3 satellite, an enhanced approach to biomass estimation in urban wetlands was developed. It was applied to the Datong Wetland, which is a typical area of postmining ecological restoration. The objectives of the study were as follows: (1) To apply improved forest AGB estimation models to an ecological rehabilitation area using remote-sensing-derived data and ground-measured data (2) Figure 1). The park covers an area of approximately 14.85 ha and is a constructed wetland and scenic site. Datong wetland was built on an abandoned coal mine for ecological restoration from 2004 to 2010 [36]. The elevation rises from 30 m to 200 m, with average annual precipitation of 471.9-1428.3 mm. The annual mean temperature is about 15.3°C, with the highest temperature reaching 41.2°C in summer and the lowest reaching −22.2°C in winter. The main forest communities include Metasequoia glyptostroboides, Populus adenopoda Maxim, and Quercus acutissima Caruth.

Field Sampling.
To build a model based on the relationship between field-measured and remote-sensing data and to test its performance, a field study was required. Hence, a total of 60 plots with dimensions of 10 m × 10 m were established in September 2016 and May 2017 (Figure 2).
The samples were arranged evenly across the region according to forest types, topographic features, and transportation accessibility. For each sample, we recorded the geographic coordinates, diameter at breast height (DBH), and height, volume, and species information for trees with DBH > 5 cm in September 2015 and May 2016. Each tree was tagged and assessed as either alive or dead. The central points of all 60 samples were measured with a global positioning system (GPS; Garmin MAP 62CS; accuracy: ±3 m); 70% of the samples were used to build the models and 30% were used for accuracy assessment. According to appearance, the forests were divided into coniferous forest (C), broad-leaved forest (B), or coniferous and deciduous broad-leaved mixed forest (M). Then, we calculated the biomass indexes on the basis of observed data, including average DBH, tree height, and stem density in each sample (Table 1).
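The 70%/30% partition of the 60 plots can be sketched as follows. The text does not say how the plots were assigned to each subset, so a seeded random draw is assumed here; the plot numbering (1 to 60), the seed, and the function name `split_plots` are hypothetical.

```python
import random

def split_plots(plot_ids, train_fraction=0.7, seed=42):
    """Randomly partition plot IDs into modelling and validation subsets."""
    ids = list(plot_ids)
    rng = random.Random(seed)  # fixed seed (an assumption) so the split is reproducible
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return sorted(ids[:n_train]), sorted(ids[n_train:])

# Hypothetical plot numbering 1..60
train_ids, valid_ids = split_plots(range(1, 61))
print(len(train_ids), len(valid_ids))  # 42 18
```

A stratified draw by forest type (C, B, M) would be a reasonable alternative if the three types are unevenly represented.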
On the basis of the DBH and tree height of individuals in the sample, AGB was calculated using the continuous biomass expansion factor (BEF) method [37][38][39][40]. Using the records of DBH and tree height, we calculated the volumes of all individual trees with a volume table system and summed the values for each sample. Then, a series of mathematical models was built by regression analysis to describe the relationship between biomass (B) and total volume (V). The model can be expressed as

$$B = aV + b,$$

where B and V are the stand biomass and stand volume of the sample, respectively, and a and b are constants that have been listed for different forest types.
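As a rough illustration of the BEF step, the sketch below converts the summed tree volumes of one 10 m × 10 m plot to a per-hectare stand volume and applies a linear biomass-volume relation B = aV + b. The linear form is the usual continuous BEF formulation and is assumed here; the coefficients in `EXAMPLE_COEFFS` are placeholders for illustration only, not the values used in the study.

```python
PLOT_AREA_HA = 0.01  # a 10 m x 10 m plot is 0.01 ha

# Placeholder (a, b) pairs per forest type; NOT the study's coefficients.
EXAMPLE_COEFFS = {"C": (0.5, 10.0), "B": (0.7, 8.0), "M": (0.6, 9.0)}

def plot_agb(tree_volumes_m3, forest_type, coeffs=EXAMPLE_COEFFS):
    """Return plot AGB (Mg/ha) from individual tree volumes (m^3)."""
    a, b = coeffs[forest_type]
    stand_volume = sum(tree_volumes_m3) / PLOT_AREA_HA  # m^3/ha
    return a * stand_volume + b

# Two trees of 0.5 m^3 each in a coniferous plot
print(plot_agb([0.5, 0.5], "C"))
```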

SAR Data.
The SAR data were sourced from the Gaofen-3 satellite, which is China's first sun-synchronous C-band multipolarization SAR satellite. It has a maximum spatial resolution of 1 m [41]. It has 12 imaging modes, including focusing beam, fully polarized stripe, and wave mode, which is more than any other SAR satellite (Table 2). The data used were dual-polarized Gaofen-3 data acquired on 28 December 2017, consisting of single-look complex (SLC) images in HH and HV polarizations with a spatial resolution of 5 m in slant range and an incidence angle between 20° and 50°.

Multispectral Images.
Multispectral images were captured by the WorldView-3 satellite on 1 May 2016 during good weather and clear skies. In addition to panchromatic images with 0.31 m resolution and 8-band multispectral images, WorldView-3 also provides 8-band short-wave infrared images with 1.24 m resolution and 12 CAVIS (clouds, aerosols, vapour, ice, and snow) band images [42]. In this study, a WorldView-3 image with UTM projection and the WGS 84 coordinate system was used to derive vegetation indexes and, furthermore, to model AGB with observed forest biomass data. The image consists of 4 bands (red, green, blue, and near infrared), and its size was 495 lines × 415 pixels at the nadir, with 16-bit data.
Figure 3 provides an overview of the methodology, including data preprocessing, parameter extraction, and model building.

Data Preprocessing.
The first step of the research was to preprocess the WorldView-3 and GF-3 images. Preprocessing of the GF-3 data was implemented using PIE-SAR software. The GF-3 images were first multilooked, using factors of 1 and 3, to reduce speckle and generate square pixels. Then, they were speckle-filtered using the refined Lee filter. Geometric distortions, which are caused by interactions between the terrain and the SAR imaging parameters, were removed using ASTER digital elevation model (DEM) data. Sloped terrain can disperse the radiation from radar backscattering [24]. Therefore, it is essential to correct this through radiometric calibration so that the image represents the backscattering characteristics of the ground target. The first step in preprocessing the WorldView-3 images was geometric correction using ground control points obtained from topographic maps. Then, radiometric correction, which converts the digital number (DN) values to apparent reflectance, was conducted with the 6S model. To make the spatial resolution of WorldView-3 and GF-3 uniform, the WorldView-3 image was resampled to a pixel size of 8 m × 8 m before parameter extraction.
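The multilooking step can be illustrated with a minimal pure-Python sketch: SLC samples are converted to intensity and averaged over blocks, and the result is expressed in decibels. This is not the PIE-SAR implementation. The text does not state which of the two factors (1 and 3) applies to range and which to azimuth, so the assignment below is an assumption, and the absolute calibration constant is omitted.

```python
import math

def multilook(slc, az_looks, rg_looks):
    """Average SLC intensity |s|^2 over az_looks x rg_looks blocks."""
    n_rows = len(slc) // az_looks
    n_cols = len(slc[0]) // rg_looks
    out = []
    for i in range(n_rows):
        row = []
        for j in range(n_cols):
            block = [abs(slc[i * az_looks + di][j * rg_looks + dj]) ** 2
                     for di in range(az_looks) for dj in range(rg_looks)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out

def to_db(intensity):
    """Convert a linear intensity value to decibels."""
    return 10.0 * math.log10(intensity)

# Toy 2 x 3 SLC patch (complex samples); 1 look in azimuth, 3 in range (assumed)
patch = [[1 + 0j, 1j, 2 + 0j],
         [3 + 4j, 0 + 5j, 5 + 0j]]
looks = multilook(patch, az_looks=1, rg_looks=3)
print([[round(to_db(v), 2) for v in row] for row in looks])
```

Averaging in the intensity domain, rather than in decibels, keeps the mean backscatter unbiased; the conversion to dB is applied only afterwards.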

Parameter Extraction and Principal Component Analysis.
To improve the AGB estimation models, vegetation indexes and their principal components obtained by principal component analysis (PCA) were selected. These are commonly used to characterize vegetation and distinguish between tree canopy and cultivated land/grassland [43][44][45][46]. These vegetation indexes include the normalized difference vegetation index (NDVI), the relative vigour index (RVI), the difference vegetation index (DVI), and the renormalized difference vegetation index (RDVI) [47]. Their equations are

$$\mathrm{NDVI} = \frac{NIR - R}{NIR + R}, \qquad \mathrm{RVI} = \frac{NIR}{R}, \qquad \mathrm{DVI} = NIR - R, \qquad \mathrm{RDVI} = \frac{NIR - R}{\sqrt{NIR + R}},$$

where R and NIR are the reflectances of the red and near-infrared bands, respectively. Since a large number of vegetation indexes were calculated, PCA was used to reduce the dimensionality and simplify the data structure. We regard these vegetation indexes as p random variables, denoted $X_1, X_2, \ldots, X_p$. PCA transforms them into m new indicators $F_1, F_2, \ldots, F_m$ (m < p), which are independent of each other and reflect the information of the original indicators according to the principle of preserving the main information. PCA seeks each $F_i$ as a linear combination of the original indicators, so the indexes were transformed using linear formulas:

$$F_1 = a_{11}X_1 + a_{21}X_2 + \cdots + a_{p1}X_p = \alpha_1' X,$$
$$F_2 = a_{12}X_1 + a_{22}X_2 + \cdots + a_{p2}X_p = \alpha_2' X,$$
$$\vdots$$
$$F_m = a_{1m}X_1 + a_{2m}X_2 + \cdots + a_{pm}X_p = \alpha_m' X,$$

where $a_{ij}$ is the transform coefficient of variable $X_i$ in component $F_j$.
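The four vegetation indices can be computed per pixel from red and near-infrared reflectance, as in the sketch below. It assumes the standard definitions of NDVI, RVI (as the simple NIR/red ratio), DVI, and RDVI; the function name and the sample reflectances are illustrative.

```python
import math

def vegetation_indices(nir, red):
    """Return NDVI, RVI, DVI and RDVI for one pixel (reflectances in 0..1)."""
    return {
        "NDVI": (nir - red) / (nir + red),
        "RVI": nir / red,
        "DVI": nir - red,
        "RDVI": (nir - red) / math.sqrt(nir + red),
    }

# Illustrative reflectances for a vegetated pixel
vi = vegetation_indices(nir=0.5, red=0.1)
print({name: round(value, 4) for name, value in vi.items()})
```

Because all four indices are built from the same two bands, they are strongly intercorrelated, which is the motivation for the PCA step.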
In the PCA results, PC1 had the largest contribution rate, indicating that it concentrated most of the characteristics of the four indicators. Then, the index with the maximum coefficient in PC1 was selected as the modelling parameter.
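The selection rule (take the index with the largest coefficient in PC1) can be sketched without external libraries by running power iteration on the correlation matrix of the standardized indices. Working on the correlation matrix rather than the covariance matrix is an assumption, as the text does not say which was used, and the sample values are invented for illustration.

```python
import math

def first_principal_component(columns, iterations=500):
    """Return (PC1 loadings by name, share of total variance explained)."""
    names = list(columns)
    n = len(columns[names[0]])
    z = {}
    for name in names:  # standardize each variable to zero mean, unit variance
        vals = columns[name]
        mean = sum(vals) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in vals) / (n - 1))
        z[name] = [(v - mean) / sd for v in vals]
    p = len(names)
    corr = [[sum(a * b for a, b in zip(z[r], z[c])) / (n - 1) for c in names]
            for r in names]
    vec = [1.0] * p  # start vector; suits positively correlated indices
    for _ in range(iterations):  # power iteration towards the dominant eigenvector
        new = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in new))
        vec = [x / norm for x in new]
    eigenvalue = sum(vec[i] * sum(corr[i][j] * vec[j] for j in range(p))
                     for i in range(p))
    return dict(zip(names, vec)), eigenvalue / p

# Invented index values for five pixels
sample = {
    "NDVI": [0.42, 0.55, 0.61, 0.70, 0.78],
    "RVI": [2.4, 3.4, 4.1, 5.7, 8.1],
    "DVI": [0.15, 0.22, 0.25, 0.33, 0.40],
    "RDVI": [0.25, 0.35, 0.39, 0.48, 0.56],
}
loadings, share = first_principal_component(sample)
selected = max(loadings, key=lambda name: abs(loadings[name]))
print(selected, round(share, 3))
```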

Model Building.
Based on six common functional models and the single variables, we constructed the regression models using optical vegetation indexes, backscattering coefficients, and observed biomass data. A total of 6 functions were applied to build the models: (1) linear function, (2) multivariable linear regression, (3) exponential function, (4) power function, (5) logarithmic function, and (6) growth function.
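The single-variable forms can all be fitted with ordinary least squares after a suitable transformation, as sketched below. The parameterizations are assumptions, since the text only names the functions: linear y = a + bx, exponential y = a·e^(bx), power y = a·x^b, and logarithmic y = a + b·ln x. A growth function written as y = e^(a + bx) reduces to the same log-linear fit as the exponential form and is therefore not listed separately.

```python
import math

def ols(x, y):
    """Least squares for y = intercept + slope * x; returns (intercept, slope, r2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return intercept, slope, 1.0 - ss_res / ss_tot

def identity(v):
    return v

# (transform of x, transform of y) that linearizes each form
FORMS = {
    "linear": (identity, identity),
    "exponential": (identity, math.log),   # ln y = ln a + b x
    "power": (math.log, math.log),         # ln y = ln a + b ln x
    "logarithmic": (math.log, identity),   # y = a + b ln x
}

def fit_forms(x, y):
    """Fit every candidate form; r2 is measured on the transformed scale."""
    return {name: ols([fx(v) for v in x], [fy(v) for v in y])
            for name, (fx, fy) in FORMS.items()}

# Synthetic data following y = 2 * exp(0.5 x)
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * math.exp(0.5 * v) for v in xs]
fits = fit_forms(xs, ys)
print(max(fits, key=lambda name: fits[name][2]))
```

The power and logarithmic forms require positive predictor values, so they cannot be applied directly to backscatter coefficients expressed in decibels, which are typically negative.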
In addition, the absolute and relative root mean square errors (RMSEs) were calculated to assess the precision of the different models. The RMSE can be expressed as

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(B_{o,i} - B_{e,i}\right)^{2}},$$

where $B_o$ is the observed biomass, $B_e$ is the estimated biomass, and n is the number of samples.
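Both error measures can be computed as in the following sketch. The relative RMSE is assumed here to be the RMSE divided by the mean observed biomass, expressed as a percentage; the text does not spell out its normalization, so this is one common convention rather than a confirmed definition. The sample values are invented.

```python
import math

def rmse(observed, estimated):
    """Root mean square error between observed and estimated biomass."""
    n = len(observed)
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, estimated)) / n)

def relative_rmse(observed, estimated):
    """RMSE as a percentage of the mean observed value (assumed convention)."""
    mean_observed = sum(observed) / len(observed)
    return 100.0 * rmse(observed, estimated) / mean_observed

# Invented plot values in Mg/ha
obs = [100.0, 200.0]
est = [110.0, 190.0]
print(rmse(obs, est), round(relative_rmse(obs, est), 2))
```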

Vegetation Index-Based Model.
The variables of the models include each type of derived vegetation index. All the models were summarized and their precision was assessed by $R^2$ values (coefficient of determination) and F-tests (Table 3). Then, the best-fitting equation for each variable, together with its F-value and $R^2$ value, was applied to test the model.
Apart from NDVI, most of the variables had a poor fit and explained <53% of the variance. The AGB regression models derived from vegetation indices, based on an F threshold of 7.09 and a significance level of 0.05, are shown in Table 3. The results indicate that the RVI, DVI, and RDVI indexes produced relative RMSEs of 34.80%, 31.9%, and 32.73%, respectively.
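For reference, the two fit statistics used to screen the models can be computed as below. The text reports F-values and a threshold but not the formula behind them, so the standard overall regression F-statistic, F = (R²/k) / ((1 − R²)/(n − k − 1)) with k predictors and n samples, is assumed. The numbers in the example are invented.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def f_statistic(r2, n_samples, n_predictors):
    """Overall regression F-statistic for a model with an intercept."""
    df_model = n_predictors
    df_resid = n_samples - n_predictors - 1
    return (r2 / df_model) / ((1.0 - r2) / df_resid)

# Invented example: R^2 = 0.5 from 12 samples and one predictor
print(f_statistic(0.5, n_samples=12, n_predictors=1))
```

A model passes the screening when its F-value exceeds the critical value for the chosen significance level and degrees of freedom.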
The NDVI-based model was more relevant to AGB than the others. NDVI was moderately correlated with the observed biomass (Table 3). However, in contrast to the simple one-factor vegetation index-based models, the combined vegetation model using NDVI and DVI explained the variance poorly (RMSE ≥ 75%), possibly because the significant correlation between vegetation indices interferes with linear relationships. Thus, the best vegetation index model for AGB estimation using NDVI as a variable was the following: where AGB is the predicted biomass and NDVI is the pixel value of the parameter maps. The variance analysis showed that the NDVI-based model offered a significant enhancement in the estimates and led to a lower relative RMSE compared to the application of other single vegetation indices or combinations.

Backscatter Coefficient-Based Model.
We used similar methods to build the models on the basis of the backscatter coefficient. The variables used were HV or HH polarized microwave backscatter data. The regression coefficients of these models were 0.823, 0.836, and 0.81, respectively, which can be used to explain the fitting accuracy. Table 3 indicates that the best-fitting linear correlation was obtained with the HV backscatter coefficient extracted from the GF-3 data. The model can be expressed as follows: where AGB is the predicted biomass and HV is the radar backscatter coefficient derived from Gaofen-3 data.
There was a remarkable correlation (P < 0.001) between the estimates based on the backscatter coefficient and the observed biomass; the model could explain about 89% of the variance and showed an RMSE lower than 30.87 Mg/ha (relative RMSE = 26.72%).

Combined Models Using Vegetation Indexes and the Backscatter Coefficient.
Although the AGB models based on the backscatter coefficient and on vegetation indexes each provided a fitting accuracy of >70%, combined models were built to improve the accuracy further. PCA was also performed on these parameters, and the results show that NDVI and the HV backscatter coefficient made the biggest contributions. Therefore, they were selected for multivariate regression. We established one model to improve the estimation accuracy. The model can be expressed as follows: where AGB is the predicted biomass, HV is the radar backscatter coefficient derived from Gaofen-3 data, and NDVI is the pixel value of the parameter maps. The model that combined the vegetation indices and the backscatter coefficient provided a standardized regression coefficient of 0.861, which can explain 74% of the variance, and led to a relative RMSE of 19.36%. The results show that, among the models combining a vegetation index with the backscatter coefficient, the combination of NDVI and the HV backscatter coefficient fitted the observed data best.
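A two-predictor linear model of this kind can be fitted by solving the normal equations, as in the sketch below. The model form AGB = b0 + b1·HV + b2·NDVI is assumed from the description of a multivariate linear regression; the data are synthetic and the coefficients used to generate them (150, 5, 100) are invented for the demonstration, not taken from the study.

```python
def fit_two_predictor_model(hv, ndvi, agb):
    """Least-squares fit of agb = b0 + b1*hv + b2*ndvi; returns [b0, b1, b2]."""
    rows = [[1.0, h, v] for h, v in zip(hv, ndvi)]
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, agb)) for i in range(3)]
    aug = [xtx[i] + [xty[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(3):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][3] / aug[i][i] for i in range(3)]

# Synthetic plots: HV backscatter in dB and NDVI, with AGB generated exactly
hv_db = [-20.0, -18.0, -15.0, -12.0, -10.0]
ndvi_values = [0.3, 0.5, 0.4, 0.7, 0.6]
agb_values = [150.0 + 5.0 * h + 100.0 * v for h, v in zip(hv_db, ndvi_values)]
coeffs = fit_two_predictor_model(hv_db, ndvi_values, agb_values)
print([round(c, 3) for c in coeffs])
```

Solving the normal equations directly is adequate for two weakly correlated predictors; with strongly collinear inputs, a QR- or SVD-based solver would be numerically safer.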

Selecting the Best Model and Mapping the AGB.
The evaluation of the models (Table 3) shows the following results. For the four vegetation indices, the NDVI-based models explained approximately 68% of the variance and led to a relative RMSE of 30.12%; the RDVI model was correlated with the observed biomass with an $R^2$ of 0.73 and a relative RMSE of 32.73% (Figure 4). The other two vegetation-based models (DVI and RVI) had poor fits and low $R^2$ values of <0.5. The variance analysis shows that a significant enhancement in the predicted values was observed with the NDVI-based model compared to models using other single vegetation indices or combinations. Of all the VI-based models, the NDVI-based model showed the maximum F-value and minimum RMSE and $R^2$ values. The AGB values estimated by these models versus the observed AGB values are plotted as a scatter diagram with linear correlations in Figure 4. Figure 4 also suggests that the backscatter coefficient models yielded a significant improvement in biomass estimation compared to those using vegetation index parameters. As for the AGB estimation models derived from GF-3 data, the HH- and HV-based models were strongly related to the observed values, with $R^2$ values of 0.823 and 0.836, respectively. Meanwhile, the combined model showed a lower correlation with the observed AGB than the single-variable models; it could explain about 75% of the variance and resulted in a relative RMSE of 25.58% (Figure 4). The probable cause is that the optical sensor mainly collects surface reflections, while SAR backscatter provides vegetation structure information and is more sensitive to biomass.
An accuracy of 0.66-0.81 was achieved using the mathematical models based on single variables derived from the GF-3 and WorldView-3 data (Table 3). Thus, it may be possible to combine a few variables to obtain improved accuracy. Based on this consideration, we attempted to verify whether combined variables could enhance the modelling precision. Among the combined models, using NDVI and the HV backscatter coefficient of the GF-3 data as variables provided the best correlation with observed values. The combined model was significantly correlated with the observed values, with an $R^2$ of 0.861, the highest correlation with AGB, and a relative RMSE of 19.36% (Figure 4). To show the performance of each model, the errors between the estimated and observed values were calculated based on the verification dataset (Figure 5). The results indicate that the largest median was observed in model 3, which means it had the largest average error. Meanwhile, model 1 had the narrowest error range (2.36-8.62 Mg/ha), while model 5 had the widest (0.55-25.53 Mg/ha), which means their error distributions were concentrated and scattered, respectively. Finally, the model with the highest $R^2$ and F-values was chosen as the best model derived from the aforementioned tests. Then, AGB prediction maps were generated for Datong wetland (Figure 6).
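The per-model error comparison on the verification plots can be sketched as follows. The errors are assumed to be absolute differences between estimated and observed AGB, which is consistent with the positive ranges quoted above but is not stated explicitly; the plot values are invented.

```python
import statistics

def error_summary(observed, estimated):
    """Minimum, median and maximum absolute error (Mg/ha) for one model."""
    errors = [abs(o - e) for o, e in zip(observed, estimated)]
    return {
        "min": min(errors),
        "median": statistics.median(errors),
        "max": max(errors),
    }

# Invented verification plots (Mg/ha)
observed_agb = [100.0, 120.0, 140.0]
estimated_agb = [98.0, 125.0, 130.0]
summary = error_summary(observed_agb, estimated_agb)
print(summary)
```

A narrow min-max range indicates concentrated errors, whereas a large median indicates a large typical error, which is how the models are compared in Figure 5.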

Revelations and Limitations of the AGB Estimation Method.
Although empirical models are limited by their reliance on experience, instability, nonversatility, and poor adaptability, they remain one of the main approaches to biomass modelling because of the difficulties in obtaining physical parameters such as aerosol optical depth and leaf area density. Accordingly, we proposed a collaborative observation based on WorldView high-resolution optical imaging and new SAR data from the Gaofen-3 satellite. This was possible due to improvements in the ability to acquire high-resolution remote-sensing data and the expanded demand for refined observations. The proposed approach can improve the precision of biomass estimation and provide a reference for similar studies.
Of the four models based on different vegetation indices, the most significant correlation with measured AGB was provided by the NDVI-based model. However, although NDVI-based models showed higher accuracy than the others (lower RMSE and greater F- and $R^2$ values), their accuracy is lower than that of the backscatter and combined models. Therefore, the result suggests that NDVI is saturated and its sensitivity is reduced in densely vegetated areas. In fact, red light is strongly absorbed, particularly in high-density forests, whereas near-infrared reflectance progressively increases because of multiple scattering effects [48]. Although vegetation index-based MLR is a common approach to building models, it has some limitations. Significant correlations among vegetation indices cause interference that obscures linear relationships, and not all vegetation indices are linearly correlated with biomass. Thus, adding variables does not always improve accuracy. Furthermore, indices based on longer wavelengths correlate more significantly with biomass in dense vegetation than the standard NDVI [49]. The application of backscattering coefficients is a common and effective way to build biomass models [50][51][52], as confirmed by our results. To reduce the influences of soil and surface moisture, selecting seasons with little rainfall can help enhance data quality [53]. The AGB regression models based on SAR data all performed better than the models based on vegetation indices, with greater $R^2$ and F-values and lower RMSEs (Table 3 and Figure 4). Compared with the HH backscatter coefficient-based model, the regression model with the HV backscatter coefficient showed higher accuracy (higher $R^2$ and F-values and lower RMSE). The multivariate linear regression model with the combination of HH and HV data provided greater accuracy than the model based on single-polarized data (Table 3).
Although the single variables of NDVI and HV polarization were highly correlated with field-measured biomass, the accuracy of the multivariate linear regression model could be increased to 83% by including HV polarized data in the PCA. The combined biomass model performed dramatically better than those based on single-polarized data or vegetation indices. This improvement is mostly due to the WorldView-3 data containing surface information on tree crowns and the Gaofen-3 data containing information on forests based on backscattering from the forest structure. The modelling accuracy was highly dependent on the amount and quality of the remote-sensing and field data used. The results of the accuracy assessment show that there was uncertainty in the AGB estimation accuracy and AGB distribution, such that the estimated results do not entirely correspond with the data from the 60 field sampling sites. Thus, using more samples may improve estimation accuracy. Furthermore, this study was conducted in an area characterized by its spatial distribution. Since the study was conducted using empirical models, the high accuracy of AGB estimation obtained in this study may require further validation in other areas and in different seasons.
In future, biomass remote-sensing estimation could be further improved by adjusting the data sources and methods used. From the perspective of data sources, because LiDAR technology can accurately describe the three-dimensional structure of forests and is highly correlated with observed biophysical parameters of vegetation, such as biomass, it has strong application prospects for mapping and estimating continuous changes in regional biomass. However, since LiDAR signals are often affected by noise or interference, it is essential to enhance the point cloud when modelling based on LiDAR data [54]. From the perspective of methods, machine learning is an important development trend that has the potential to effectively combine data from different sensors with different modelling algorithms to improve the accuracy of biomass estimation.

Possible Influences on AGB Distribution.
In this research, a total of 60 samples from broad-leaved, coniferous, and mixed forests (Table 1) were used to analyse the impacts of forest type and structure on AGB estimates. The accuracy of AGB estimation depends not only on spectral reflectance differences but also on the forest structure.
Most of the broad-leaved stands were secondary forest planted for the restoration of coal mining wasteland in the study area. The RMSEs of these stands were higher than those of other stands due to their disordered structures and lower AGBs. In contrast, artificial coniferous forest had a regular spatial structure that reduced the random scattering from the canopy and strengthened the single reflections of SAR. Large broad-leaved trees dominate the mixed forests, while coniferous trees are distributed in the understory. Because of the complex spatial structure and high canopy density of these trees, scattering was characterized by a high degree of randomness and, thus, the lowest estimated error.

Conclusions
This study explored the potential of using WorldView-3 and Chinese Gaofen-3 data to model AGB in an urban wetland. The main conclusions are as follows:
(1) The new data obtainable from Gaofen-3 images are valuable for estimating and mapping AGB. The correlation between the backscattering coefficient of HV polarization and biomass was higher than that of HH polarization, indicating that HV polarization is more suitable for biomass estimation.
(2) Of the simple one-factor models, the exponential model had the best fit, whether it was based on a vegetation index or the SAR backscattering coefficient.
(3) The reliability of the linear combined model based on HV and NDVI data was demonstrated by its high overall accuracy (F = 7.9, $R^2$ = 0.861) at fine scales using small forest plots with considerable biomass densities based on Gaofen-3 and WorldView-3 data. This indicates that a combination of optical and microwave remote-sensing images can effectively improve the accuracy of AGB estimation.

Data Availability
The result data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare no conflict of interest.