Modeling and Mapping of Aboveground Biomass and Carbon Stock Using Sentinel-2 Imagery in Chure Region, Nepal

,


Introduction
Forests have a crucial function in reducing CO 2 levels in the atmosphere and the global climate system [1,2]. Tropical forests, in particular, store a signifcant amount of carbon in biomass and soil, i.e., 56 and 32%, respectively [3]. However, they are being removed quickly, contributing to 12-20% of all human CO 2 emissions [4]. Recognizing this possibility, the United Nations Framework Convention on Climate Change (UNFCCC) established the reducing emission from deforestation and degradation and includes the role of conservation, sustainable management of forests, and enhancement of forest carbon stock (REDD+) scheme which incentivizes developing countries to conserve and manage their forests sustainably to reduce carbon emissions and promote sustainable development. For the mechanism to work, there must be transparent, comprehensive, steady, comparable, and precise national and subnational MRV (measurement, reporting, and verifcation) systems for forests to track and assess the amount of aboveground biomass/carbon stock and carbon emitted [5].
Te aboveground biomass (AGB) is the most signifcant carbon pool in a tree; however, human activities such as deforestation can cause a reduction in forest area, resulting in deterioration of the AGB, carbon stock (CS), and CO 2 sequestration from the atmosphere. To track ecosystem responses to climate change and understand the global carbon cycle, forestry professionals, managers, and scientists need to perform terrestrial carbon accounting, which requires an assessment of AGB [6][7][8][9]. Terefore, to ensure accuracy, it is crucial to estimate biomass using a reliable method.
Te two primary methods for estimating forest biomass are conventional feld-based techniques and remote sensing (RS) methodologies. Te conventional approach to estimating forest AGB is very precise [7] but laborious, expensive, and time-consuming [7,10]. Combining RS and sample plot data to obtain spatially explicit estimates of forest AGB has become a common strategy [11]. RS techniques give an alternative to established methods of measuring biomass and CS [12]. Te ability to estimate the spatial distribution of AGB at a reasonable cost with tolerable accuracy has led researchers to accept the use of RS in height [13] and in AGB estimation [7,14,15].
RS technologies have become a key tool for addressing some of the limitations associated with feld data sampling by improving the accuracy of inventory estimates and lowering the costs of forest resource inventory and monitoring at landscape scales [16][17][18]. Te majority of RS studies make use of optical (Landsat, Sentinel 2A, and LiDAR), synthetic aperture radar (SAR, Sentinel 1), or a combination of datasets for modeling and estimating AGB [11,[19][20][21][22][23][24]. Compared to techniques using radar and optical data, Li-DAR systems have demonstrated a superior ability in forecasting and estimating AGB with greater precision [25]. However, its more extensive operational applications in estimating forest AGB in low-and middle-income countries with greater forest coverage are hindered by the data availability restrictions, high cost, and enormous data volume [26]. With the start of the European Union's Copernicus program, the global library of open access data has grown even more, with signifcant improvements in spatial, temporal, and radiometric precision [27,28]. Sentinel-2 is a polar-orbiting sensor comprised of two satellites, each of which carries an MSI with a 290 km wide swath and a multipurpose design with 13 spectral bands spanning from visible and near-infrared (NIR) wavelengths to shortwave infrared wavelengths at fne (10, 20 m) and coarse (60 m) spatial resolution. Te sensor has a high potential for mapping diferent vegetation characteristics due to the presence of four bands within the red-edge region, which are centered at 705 (band 5), 740 (band 6), 783 (band 7), and 865 nm (band 8a) [29]. Vegetation indices (VIs) are a mathematical combination of spectral bands computed by rationing, diferencing, rationing diferences, and sums by forming a linear combination of bands [30]. Many studies show that satellite-based VIs are most commonly employed for biomass modeling and estimation [21,27,[31][32][33].
AGB was estimated in tropical forest of Mexico [34], Philippines [35], and India [36], using the Sentinel-2 imagery and random forest (RF), a machine learning algorithm to select the best model. Only a small number of AGB and CS estimation studies have been carried out in tropical forests with detailed ground-based quantifcation in Nepal despite the rapid advancements in RS. As per prior to our knowledge, Chure area lacks estimation of AGB and CS using combined eforts of RS and feld ground data due to the complex topographical structure of Chure. Chure region is the youngest mountain range in the lesser Himalayas and is considered the most fragile and erodible [37]. It covers 12.8% of Nepal's land area and is home to 14% of the country's population [38]. Te high population density in the Chure region has led to increased pressure on natural resources [39], which has resulted in deforestation and land degradation. Estimating AGB accurately is critical for quantifying the amount of carbon stored in forests, which is necessary for implementing REDD+ initiatives and establishing opportunities for carbon credit trading to promote forest conservation in the region. Few studies have been conducted in Nepal for estimating AGB with the integration of RS and feld data. For example, Karna et al. [40] used WorldView-2 satellite images with small-footprint airborne LiDAR data to estimate tree carbon at the species level in a tropical forest in Nepal. Similarly, Baral Jamarkattel [41] achieved 61% accuracy when predicting the carbon content of a subtropical forest in central Nepal by integrating GeoEye and WorldView-2. A study by Koju et al. [19] used a two-scale approach for estimating AGB with optical RS images in a subtropical forest of Nepal. Kandel [20] estimates AGB and CS by integrating LiDAR, satellite images, and feld measurement. Pandit et al. [21] estimate the AGB in subtropical bufer zone community forests using Sentinel 2 data. Qazi et al. [42] compare forest AGB estimates from passive and active RS sensors over Kayar Khola watershed. Similarly, Pandit et al. [43] perform AGB estimation in the bufer zone community forest of central Nepal coupling in situ measurement with Landsat 8 satellite data. Assessment of AGB and CS in Nepal is primarily focused on the subtropical and central parts of the Nepal.
Te absence of baseline data on forest resources from the local to the regional level is one of the main obstacles to accurate forest AGB estimation in Nepal. Tis is still one of the major problems where the forest staf's ability to conduct an inventory is very limited. Te cause could be attributed to a lack of technical assistance and excessive reliance on conventional feld data collection techniques. To the best of our knowledge, this is the inaugural attempt to conduct a research study in the tropical forest of the Chure area and integrate Sentinel-2 spectrally derived indices with plot-level AGB to model and estimate AGB in Sainamaina municipality forest using modeling techniques.

Study Area.
Tis study was conducted in the Chure area of Sainamaina municipality situated in the Rupandehi district of Nepal's Lumbini province. Te geographical location of the study area is 83°15′44″E to 83°21′01″E longitude and 27°38′48″N to 27°46′05″N latitude Figure 1. Te area of Chure region of Sainamaina municipality is 9235.11 ha [44]. Te study area has typical tropical forest characteristics of undulating topography. Major species of this region are Shorea robusta, Syzygium cumini, Lagerstroemia parvifora, Mallotus philippinensis, and Anogeissus latifolia. Te research site has a varied topography, with slopes ranging from 00 to 65.740 with an elevation ranging from 60 meter above sea level to 938 meters. Te average annual precipitation in the area is 2600 mm, with 80% of the rainfall occurring during the monsoon season. In addition, the mean maximum and minimum temperatures recorded were 42.5°C and 7.5°C, respectively [45].

Sampling Strategy and Field Data Collection.
To collect the data on vegetation in the study area, the grid of 2 * 2 km was laid using ArcGIS. Each selected grid was chosen to represent the vegetation of the entire study area, as illustrated in Figure 2. Field data were collected between August and September 2021 using a circular sample plot of 500 m 2 (12.62-meter radius). A total of 72 plots (calculated by using the equation provided by the Community Forestry Inventory Guideline 2004) were sampled using a stratifed random sampling method in ArcGIS 10.5. In some instances, sample plots were relocated to new coordinates up to 50 meters away due to inaccessibility. In each plot height, the diameter of an individual tree was measured at the breast height (DBH) of 1.3 meter above the ground level, and the species with DBH ≥ 8 cm were measured [46,47]. Each tree species name was recorded in each plots.

Sentinel-2 Image Acquisition and Processing.
Te satellite image of Sentinel-2 level 1C was downloaded from the website of the European Space Agency (ESA) (https:// sentinel.esa.int/web/sentinel/sentinel-data-access), which is composed of 100 * 100 km 2 tiles dated 20 th April 2021 for the study areas with 10 m * 10 m spatial resolution and the cloud cover less than 1%. Te data are collected at 10, 20, and 60-meter spatial resolution in 13 spectral bands that span the visible, near-infrared (NIR), red-edge, and shortwave infrared (SWIR) wavelength ranges and had been previously preprocessed to refect the top of the atmosphere (TOA). Sentinel-2 level 1C was processed to level 2A to gain the bottom of the atmosphere (BOA) corrected refectance image using the ATCOR algorithm through Sen2Cor plugin in Sentinel Application Platform (SNAP) software. As the image had a spatial resolution of 10 m, it was considered adequate for the study. To ensure consistency, the image was resampled in SNAP to a spatial resolution of 10 m using the "Band 2" tile as the reference International Journal of Forestry Research band source product. Finally, the image was subset to capture the research area in ArcMap.

Field Data Analysis of AGB and CS.
For this study, a revised and improved allometric equation adopted by Chave et al. [48] especially for tropical forest trees was used for estimating AGB. Te equation by Chave et al. [48] is based on the diameter, height, and specifc density of wood (ρ) for the calculation of AGB. Data on diameter and height were directly collected from the feld, while specifc wood density is derived from various literature reviews. Species with their default (ρ) values given by Khanna and Chaturvedi [49] and Takur [50] was used to calculate tree level biomass. In the absence of specifc values, a general value (ρ � 0.674) was used [21]. Individual biomass was calculated and then aggregated to obtain individual plot-level AGB: where AGB est is AGB estimated in kilogram, D is DBH in cm, H is height in meter, ρ is wood density in g/cm 3 , and 0.0673 and 0.976 are constants. Total AGB for individual plots was computed using equation (1) and then converted to ton per hectare (t·h − 1 ). Subsequently, the amount of CS was calculated from the AGB using the conversion factor (CF) of 0.47 [51]. CS was calculated using the following equation: where CS is the carbon stock in ton, AGB is the aboveground biomass, and CF is the carbon fraction (0.47).

Deriving VIs from the Sentinel-2 Optical Satellite
Image. Te Sentinel toolbox and the Raster calculator tool in ArcMap software were used to calculate a variety of vegetation indices (VIs) from the spectral bandwidths of the Sentinel-2 image. Te selection of the specifc VIs was based on their efcacy in previous studies focused on estimating biomass. Tere are over 150 diferent VIs available, but only 12 of them were chosen for this particular study based on their performances for analyzing and predicting the AGB. Table 1 provides the formulas and authors of the 12 selected indices.  neighbor pixel values if the area of the sample plot lies in multiple pixels.

Statistical Analysis.
Statistical Package for Social Science (SPSS) and Microsoft Excel were used for statistical analysis. Te relationship between each VI and AGB was assessed using fve diferent functions, i.e., linear, logarithmic, quadratic, power, and exponential regression models. Te general forms of fve functions are given as equations (3)-(7). For all the abovementioned functions, AGB was used as a dependent variable (y) and VI was used as an independent variable (x) to determine change in AGB as change in VI. A total of 43 sample plots were used for analyzing the relation and model development, and 29 were used for model validation: where Y is the aboveground biomass (t·h − 1 ), VI is the value of vegetation indices, and β 0 , β 1 , and β 2 are parameters.

Model Performance Assessment.
For assessing the model performance, a statistical evaluator was used, i.e., R 2 . Generally, a higher R 2 value indicates the better performance of the model [64]. For this, the study model is built up with comparing AGB t·h − 1 with diferent 12 VIs along 5 diferent regression models, and the model with an R 2 value higher than 0.70 was further selected to determine its consistency.
To compare the best selected model of AGB with VIs, Adj. R 2 (8), RMSE, Akaike information criterion (AIC) (9), and Bayesian information criterion (BIC) (10) were used [64]. Tese criteria provide a simple and efective method for selecting and comparing regression models. Te model with a high value of R 2 adj and a low value of RMSE, AIC, and BIC was considered to be the best for AGB estimators [65]: where N � number of samples, p � number of predictor value, and R 2 is the coefcient of determination: where n � number of plots, SEE � sum of squares due to errors, and k � number of parameters.

Model
Validation. 29 sample plots were used for the model validations which are independent of the model development data. Te predicted AGB obtained from the model was correlated with the calculated AGB to observe r and R 2 (11). Furthermore, the root mean square error (RMSE) (12) was also calculated: where n � number of sample plot (29), y o is observed AGB, and y p is the predicted AGB value for the plot i in t·h − 1 .

Mapping AGB and CS.
Te AGB and CS maps for the research area were estimated using a regression model constructed using VIs and AGB. After selection of the model and its validation, an AGB CS map of the entire research area was created in ArcGIS. Te NDVI Raster map was selected in the algebra map expression in the Raster calculator, and

Correlation Coefcient and Coefcient of Determination between VIs and AGB t·h − 1 . All of the vegetation indices (VIs) demonstrated a strong correlation with aboveground biomass (AGB) at the plot level, with all 12 VIs
showing signifcance at a level of p < 0.05 at the 5% level of signifcance. A total of 60 models were evaluated (12 VIs * 5 models) using plot-level biomass, and the indices with an R 2 value greater than 0.70 were selected as the best ft for the model. Te overall R 2 value of all models has been summarized in Table 2.

Model Development and Validation.
For the purpose of the model development, 43 sample plots were used and models with the value of R 2 > 0.70 were considered valid.
Out of the 60 models tested, 18 were chosen for further analysis, with 10 being polynomial and 8 being linear models. Te development of the models was based on criteria such as high R 2 , high adjusted R 2 values, low RMSE, low AIC, and low BIC. Te parameter estimates and ft statistics for each of the selected models are summarized in Table 3. Analyzing Table 3, the quadratic form had a higher R 2 value than the linear form. Tis indicates that the quadratic model of NDVI, SAVI, and ARVI was able to explain approximately 78% of the variation in AGB per hectare, while the remaining 22% was explained by other variables that were not included in the model. Similarly, performance increased with the decrease in AIC and BIC values of variable NDVI (AIC � 313.60 and BIC � 320.65) than that of SAVI (AIC � 319.62 and BIC � 320.67) and ARVI (AIC � 313.78 and BIC � 320.85). All these 3 models show the values quite close to each other; thus, a model validation was performed using 29 sample plot data, and accuracy assessment was performed using the values of r, R 2 , and RMSE. Te scatterplot of the observed and estimated AGB was plotted for NDVI, ARVI, and SAVI as shown in Figure 3. Table 4 presents the validation of the model based on the estimated and observed AGB for the quadratic model of NDVI, ARVI, and SAVI with the criterion value of r, R 2 , and RMSE.
Analyzing the model validation result, the quadratic regression model of NDVI was selected as the best ft if the model was based on the value of r, R 2 , and RMSE than that of SAVI and ARVI as shown in Table 4. Te correlation between estimated and observed AGB gave a strong R 2 of 0.8332 and RMSE of 10.765 t·h − 1 . About 83% of the observed AGB was explained by the estimated AGB according to the quadratic model of NDVI. Te general form of quadratic function as equation is as follows: where Y is the predicted biomass, ß 0 is the y-intercept which is 26.676, and ß 1 and ß 2 are the slopes which are − 125.73 and 304.0, respectively, and X is the NDVI: Figure 4 illustrates the prediction map resulted from quadratic regression models between NDVI and AGB. Te result shows the AGB ranges mostly from 0 to 129.81 t·h − 1 . Subsequently, the CS map follows a similar pattern to the AGB map as the total amount of forest AGB was multiplied by 0.47 (47% of the AGB). Consequently, there is a higher amount of carbon content in those areas where AGB is higher. Te amount of CS ranges mostly from 0 to 61.0 t·h − 1

Discussion
For an intricate topographic structure, modeling and mapping forest AGB and CS are an extremely difcult task. For estimating AGB at the regional level, the application of freely accessible multispectral sensors could be a good substitute, especially in places where access to hyperspectral images is scarce. Terefore, we determine the model and map of the AGB and CS using Sentinel-2 (MSI) spectrally derived VIs with ground-based quantifcation of data using several regression modeling approaches in the Chure area of Sainamaina Municipality. A positive correlation was observed between the VIs and AGB. Sentinel-2 derived VIs (S2REP and NDI45) show the minimum correlation as compared to other VIs. Among them, the quadratic model of NDVI, SAVI, ARVI, and RENDVI705 possess the maximum correlation and the coefcient of determination value. Te result was addressed by Pandit et al. [21], where an R 2 value for SAVI, RENDVI705, S2REP, and NDVI was 0.81, 0.76, 0.75, and 0.70, respectively. Te correlation value of 0.89 and 0.81 for the indices NDI45 and NDVI was obtained by Nuthammachot et al. [66] in private forest in Indonesia. Muhsoni      [68] in her study stated that an exponential model explains better than a linear model when combined with AGB and VI (RERVI) with R 2 = 0.66. Askar et al. [47] gained an R 2 value of NDI45, NDVI, and S2REP to be 0.79, 0.65, and 0.19, respectively, in private forest in Indonesia using Sentinel-2. Priatama et al. [69] calculated the correlation value of NDVI and SAVI to be 0.73 and 0.80 using Landsat imagery in a postmining area. NDI45 and S2REP have shown more values of r and R 2 than NDVI, and this might be due to the use red-edge region centered at 705 (band 5), 740 (band 6), 783 (band 7), and 865 nm (band 8a) in the formula as it provides the sensor a lot of potential for monitoring various vegetation features [29]. Furthermore, studies such as [47,66,[70][71][72] have demonstrated that rededge VIs diminish saturation, particularly in complex vegetation structures. Saturation occurs particularly when vegetation reaches maturity in the case of crops [73,74], while in many cases, it is because of the complex forest structure [75][76][77] causing issues in predicting forest AGB [78]. In such a case, the VIs are unable to detect any further increases in biomass because saturation occurs when vegetation completely covers the land, which is frequently expressed as full leaf area coverage. In this case, the biomass continues to grow, while the indices remain unchanged. When compared to dense forests, VIs perform better in simple structure forests [79]. Similarly, Steininger [80] [81], and this might be due to as Dfrs collected from the many sample from the Chure area too. Diferent studies used diferent models with diferent combination of RS to estimate the biomass of diferent types of forest. Sinha et al. [77] used a combination of ALOS PALSAR and Landsat TM to estimate tropical forest biomass. Te NDVI calculated from optical image spectral bands had a poor relationship with biomass, with an R 2 of 0.29, which contradicts with the fnding of our study which might be possibly due to saturation issues in the tropical forest. Similarly, Badreldin and Sanchez-Azofeifa [82] devised a method for integrating airborne LiDAR, TLS, and multitemporal Landsat satellite images in order to determine the relationship between forest stand parameters and VIs derived from Landsat optical satellite images. Te study used stepwise multiple regression analysis, canopy height, and VIs to develop a best ft model for biomass estimation (NDVI, EVI2, and TCA). Te best model had an R 2 of 0.78 and an RMSE of 44 Mg/hectare, and the value of R 2 aligns, while the RMSE contradicts with our study. Tis might be due to the use of stepwise multiple regression and use of more than one predictor variables. Macave et al. [83] used the regression model to relate AGB with diferent combinations of sensor data using RMSE, AIC, and BIC and from feld data, and the mean AGB and CS were calculated to be 56   and BIC value does not align with our study, and as a matter of fact, Macave et al. [83] used the fusion of optical and radar data with the use of multiple regression. Likewise, Pandit et al. [21] produced an AGB map for subtropical bufer zone community forests of Nepal, where the best predictor variables from the fnal model generated by the RF algorithm were used. Sentinel-2 spectral bands and spectral-derived VIs were selected for biomass estimations with R 2 = 0.81 and RMSE = 25.22 t·h − 1 . Te RMSE does not align as our study is conducted in the tropical area; also, our study uses the parametric techniques, while Pandit uses the nonparametric algorithms as random forest (RF). Pandit et al. [43] tested two statistical approaches, namely, multiple linear regression (MLR) and RF, in estimating the AGB and concluded that the RF algorithm produced better results (R 2 = 0.95 and RMSE = 13.3 t·h − 1 ) than the MLR model (R 2 = 0.56 and RMSE = 37.01 t·h − 1 ). Te data analysis adopted in our method is solely based on the parametric test as it assumes the mathematical property of the data and provides an accuracy of 83.3% by the prediction model. However, the level of precision rises with the use of machine learning techniques. In recent years, machine learning techniques such as random forest (RF), support vector machines (SVMs), and artifcial neural networks (ANNs) have become more popular in estimating forest AGB [84]. Tese machine learning methods are a more trustworthy approach to AGB estimation [85] as they do not have predetermined model structures, and the model structure is determined by the data. Ou et al. [86] compared two parametric models (linear regression model and linear regression with combined variables) with two nonparametric models (RF and ANN) and found that the latter resulted in signifcantly higher R 2 and RMSE for forest AGB estimation. Other studies have also supported this idea, suggesting that nonparametric models are better suited to capture the heterogeneity of forest AGB than parametric models [26,87].
It is worth noting that previous RS research has yielded a variety of results for numerous modeling approaches to the empirical estimation of AGB employing RS and feld data. It is difcult to determine the superiority of one method to other. Tus, each application necessitates an evaluation of the optical bands and indices to estimate biomass [7,31], considering the forest type, topographical features, and study objectives. Te future research should not be limited to the particular sensor or modeling approaches, it should get diversifed by using various sensors as optical (Landsat, Sentinel 2A, and LiDAR) and synthetic aperture radar (SAR, Sentinel 1) with a combination of parametric (linear and multiple regression models) and nonparametric algorithms such as RF, SVM, ANN, multilayer perceptron artifcial neural network (MLPNN), k-nearest neighbor (kNN), and support vector regression (SVR).

Conclusion
Here, we investigated the utility of Sentinel-2 VIs in estimating AGB and CS using fve diferent regression model techniques in the Chure area of Sainamaina municipality.
We concluded that the quadratic regression model of normalized diference vegetation indices shows the maximum value of the correlation coefcient and the coefcient of determination. Sentinel-2 data efectively predicted the AGB and CS of the study area with an R 2 value of 0.83 and an RMSE value of 10.7657 t·h − 1 . Improvements in the spatial resolution (10 and 20 m) of Sentinel-2 MSI have the potential to enable satisfactory predictions of biomass in the Chure area of Sainamaina municipality.
Overall, our fndings show the utility, potential, and power of combining in situ data with Sentinel-2 VIs in predicting biomass. Te methodology here is relatively simple and applicable to other parts of Chure region sharing similar biophysical patterns and also where hyperspectral data are scarce. Sentinel-2 is a viable option for researchers and conservationists seeking low-cost, efcient, and freely available satellite sensor data for reliable and accurate monitoring of AGB and CS in areas where ground data are scarce. Tis study helps in the improved and sustainable forest management, climate change mitigation, biodiversity conservation, and land use planning. It can also be used in preparation of Sainamaina municipality forest and environment policy, climate change adaptation action plan, and sustainable development action plan as endorsed in Chapter 15, Section 102, Subsection 1 of Local Government Operation Act 2017 of Nepal.

Data Availability
Te data presented in this study are available on request from the corresponding authors.

Conflicts of Interest
Te authors declare that there are no conficts of interest regarding the publication of this article.