Height-Diameter Modeling of Cinnamomum tamala Grown in Natural Forest in Mid-Hill of Nepal

Tree height (H) and diameter at breast height (D) are key variables to calculate tree volume and biomass. We developed a heightdiameter (H-D)model for Cinnamomum tamala by evaluating 18 nonlinear models. Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), Root Mean Square Error (RMSE), mean bias, Mean Absolute Error (MAE), graphical appearance, and biological logic were the criteria used to evaluate the predictive performance of the models. Gompertzmodel (M14) performed the best for predicting the total height of C. tamala trees with the least RMSE (1.742m), mean bias (0.012m), and MAE (1.342m) and satisfied model assumptions and biological logic. Validation data ranked the Gompertz model as the best model with RMSE (1.546m), mean bias (-0.106m), and MAE (1.149m). Despite the consistent performance of the Gompertz model, it tended to underestimate the height prediction for taller (dominant crown class) and larger trees. Further work on refitting and validation of the proposed model with data from a larger geographic area, wider-ranging sites, and stand conditions is recommended.


Introduction
Tree height (H) and diameter at breast height (D) are fundamental variables in most forest inventories which are required to calculate tree volume, biomass, carbon storage, and survival analysis [1][2][3][4].Accurate in situ measurement of D is easy and cost-effective.However, height measurement is labor-intensive, time-consuming, expensive, and prone to observational and measurement errors [5,6].Predicting forest dynamics through growth and yield simulation also requires individual tree level information, such as H and D. A height-diameter (H-D) relationship model can be easily built when both H and D variables are measured.The model then can be used to estimate missing tree height, biomass production, and stand dynamics.
The allometric equation [7] serves as a tool to relate easily measurable morphometric variables (e.g., D) to the total height of the tree.Conventionally, H-D relationship models have been developed and applied to pure even-aged stands or plantations using D as a predictor variable [1,2].Recent studies employ other stand attributes (e.g., site quality, stand age, stand density, and dominant height) in mixedeffect H-D relationship models [8][9][10][11][12][13].In mixed effects H-D relationship models, population-averaged (fixed parameters common to population), as well as subject-specific effects as random effects, are allowed.Mixed-effect H-D relationship models improve accuracy over nonlinear models that are based on the minimization of sums of squares.
Generally, the site-specific H-D relationship models that are developed as H-D relationship can vary with differences in age, site quality, competition, stand age, and silvicultural treatment applied [14][15][16][17].Although stand specific H-D relationship models are labor-intensive, costly, and timeintensive, these models have been proved to be more accurate in the realistic description of forest structure, growth simulations, and plot-level volume [18].Due to a limited geographical range of natural C. tamala stands, cross-sectional nature of data, and for simplicity of model use, we employed 2 International Journal of Forestry Research ordinary nonlinear least square equations.The model developed should be considered as a site-specific model as it was built considering fewer stands.
Cinnamomum tamala T Nees and Eberm (family Lauraceae) is a moderate-sized, evergreen, monoecious tree species distributed at 900-2000 m above sea level in the tropical and subtropical Himalayas [19].Natural stands of C. tamala, being a moderate shade-tolerant species [20], are mostly found in shady-moist sites and grow well under full illumination.C. tamala is native to Nepal, China, India, and Bhutan [21].C. tamala trees are extensively managed, in situ and ex situ, for leaf and bark production due to dwindling natural population and high economic potentials [22].The bark is traded to India [19] and is often used as a substitute for Cinnamomum zeylanicum [23].Department of forests, Nepal [24], has prioritized C. tamala for research and management because C. tamala bark forms a significant part of national nontimber forest product trade, by both volume and economic value.
Despite receiving a priority for research and management in Nepal and conservation in India, studies on C. tamala forests are limited.Since bark is a highly traded product, H-D relationship model is an essential decision-making tool both for management institutions and government for estimating forest volume and above-ground biomass, growth and yield modeling, and modeling ecophysiological processbased models of forests.However, research in this direction is limited within the distribution range of C. tamala and has mostly focused on conservation, ethnopharmacological properties, dispersion pattern, cultivation, and harvesting practices [25][26][27][28][29].In Nepal, Tree Improvement and Silviculture Component, TISC [30], and Poudel et al. [31] developed bark biomass production models in farm-grown trees and natural forests [32].Equations for the H-D relationship for Cinnamomum tamala grown in the natural forest have not been reported in past studies.Tree height increases nonlinearly with stem diameter [33,34].Therefore, linear models are inadequate for predicting tree height [35].In this paper, we evaluated 18 nonlinear models to develop the H-D relationship model for C. tamala trees.To the best of our knowledge, this is probably the first attempt to establish an H-D relationship model for natural C. tamala forests.Our model would be beneficial to researchers, managers, and academicians within the distributional range of C. tamala.However, precaution should be used for accurate prediction of height outside the size and height ranges, site, and stand condition that differs from our study.

Data Collection and
Processing.We collected the data through the inventory of natural stands of C. tamala, which includes 30 destructively sampled trees covering a wider range of H and D. Trees were selected for felling covering all the site conditions and tree size (H and D) distributions across the study sites in and outside of sampling plots.Nondestructive sample data came from a phytosociological survey of same forests [27].We used systematic sampling plots with a size of 10 m × 10 m.Each plot has 5 m × 5 m subplots at northeast and southwest corners and 1 m × 1 m subplots at four corners.Information on species such as H and D were collected from 10 m × 10 m and 5 m × 5 m plots.The combined dataset size (n = 179) is smaller compared to other studies (e.g., [12,13,18,48,49]), but it can be used to construct H-D relationship models [50][51][52].The combined dataset was split randomly into fitting (80%) and validation (20%) using R [53].Summary statistics for H and D for both fitting and validation dataset are presented in Table 1.

Model Development and Evaluation
. Initially, we selected 18 candidate nonlinear H-D relationship models (11 biparametric, and seven triparametric) based on a literature review (Table SA) to fit the model.We assumed that the total tree height is a function of diameter.In the later stage, we included only the models with all significant parameter estimates at 5% level of significance for further evaluation to compare their performance (Table 2).These models take the following functional form: where H i is the i th observation of the dependent variable tree height (m), D i is the i th observation of the independent variable diameter at breast height (cm), b is the vector of parameter to be estimated,  i is the random error term, f (.) is a nonlinear function, and i is the i th observation with i = 1, 2, 3, . .., n.A constant term 1.3 is the height of stem above ground at which D was measured.We added this term in the function to avoid prediction of zero H when D approaches zero.The nonlinear function "nls2", an improved version of the function nls, in statistical software R [54,55] was used to estimate model parameters using the ordinary least square method.There are several criteria to assess model performance [56].We evaluated model performance based on International Journal of Forestry Research 3 H i is total height (m) of tree i = 1, 2, 3,. ..,n,D-diameter at breast height (cm), b 0 , b 1 , and b 2 are parameters to be estimated, 1.3 is added to avoid prediction of zero height when D approaches zero, and  i is a random error term which is assumed to be normally and identically distributed with mean 0 and variance  2 [NID (0,  A coefficient of determination is not advisable to employ for assessing nonlinear regression.A statistic analogous to the coefficient of determination also called a fit index or modeling efficiency [14,59] could be used.However, in this study, we did not use the fit index to evaluate our models as a fit index also suffers all the weaknesses of the coefficient of determination [14].
where y i , ŷi are the observed and predicted values of height, respectively; n is the total number of observations used in model fitting; ln is natural logarithm; RSS is the residual sums of squares; and k is the number of parameters in the model.Models with least RMSE, AIC, BIC, mean bias, and MAE are considered to perform the best [60].Reliability and validity of the model increase with its performance outside the test data [61][62][63].However, due to limitation of dataset, model validation with an independent dataset covering more extensive geographical regions and management condition across the distributional ranges of species can be a potential topic for future research.
We evaluated models based on the significance of model parameters, performance criteria, and practical application.After initial performance screening, models were subjected to further comparison by ranking the performance criteria and considering the practical application of the model.The models with the least value of performance criteria (RMSE, AIC, BIC, mean bias, and MAE) received the highest ranking and vice versa.The rank value of each criterion was added to get the total sum of ranking, which was then ranked in ascending order-the lower the value, the better the model.The model with the smallest rank value was considered the best H-D relationship model of C. tamala naturally grown in mid-hill of Nepal.15) models (Table 3) showed significant parameter estimates (p<0.05,unless otherwise mentioned) and were considered for further evaluation.

Model Performance and Selection.
Parameter estimates of models indicate that the best model cannot be determined solely based on significant t-statistics.We additionally considered performance criteria values to assess the model's performance.Table 4 presents the parameter estimates and fit statistics associated with the selected models for fitting and validation dataset.
We ranked models based on values of performance criteria and calculated average rank on sum ranks of performance criteria.M14 performed the best (rank = 1) for the fitting dataset, while M14, M12, and M4 performed the best with equal rank (rank = 1) for the validation dataset (Table 5).Although M14, M12, and M4 have identical rank for validation dataset, we considered M14 the best model because it performed the best (rank = 1) for fitting dataset too, while M4 and M12 were the third best models for fitting dataset.Parameter estimates of M4 and M12 for validation dataset (Table SB), graph of the model, and residual distribution (Figures SA and SB) confirm that the models M4 and M12 are the best alternatives to model M14.Interestingly, M12 an M4 yielded strikingly similar performance statistics (identical to the three-decimal place) for model fitting, validation, and combined dataset (Tables 4 and SB).

Residual Analysis and Shapiro-Wilk Test.
To come up with the best model, we used test for residual normality.Residuals graphs (histograms, probability plots) and plot of fitted line overlaid on the observed heights data were produced for a visual check.Figure 1 presents the plot of residuals against the observed diameters of fitting data.Most of the models showed no systematic departure from random pattern.
The Shapiro-Wilk test of residuals of the best three models [M14-(W = 0.990, p = 0.4143) for fitting and (W = 0.978, p = 0.6858) for validation; M12-(W = 0.984, p = 0.627) for fitting and (W = 0.984, p = 0.871) for validation; M4-(W = 0.984, p = 0.627) for fitting and (W = 0.984, p = 0.871)] showed a p value greater than 0.05, which suggests that residuals do not violate the assumption of normal random error.Figure 2 shows curves of the best fitting model M14 for fitting and validation dataset and histogram of residuals with a superimposed normal curve.Normally, the model tended to overestimate for shorter trees with larger diameter size.To determine the better model between M12 and M4, we compared the predicted height from both models between them and with the predicted height from the best model, i.e., M14.Our results of paired t-test to compare the predicted heights showed significant statistics between M4 and M12 [t (142) = 3.1143, p = 0.002] and between M12 and M14 [t (142) = -2.8968,p = 0.0044].The M4 showed no significant difference in mean predicted heights from the M14 [t (142) = -0.8743,p = 0.3983].This suggests that the M14 has more accurate prediction power followed by M4.

Discussion
Almost all the candidate models performed well in explaining H-D relationship for C. tamala trees grown under natural forest.Model selection based on multiple criteria suggested that M14, M12, and M4 were more effective in predicting C. tamala tree height compared to other candidate models.[52,64]).Zhang (1997) selected six models for evaluating H-D relationship and found that the Gompertz model underestimates tree heights for larger sizes trees (D>100 cm).Similarly, in our case, the Gompertz model tended to underestimate heights (10.84%) for taller trees with a larger diameter (H ≥12 m and D ≥17 cm).C. tamala trees from dominant crown category grow taller, larger, and faster.Trees under shade allocate more biomass to diameter growth than height as height growth requires more competitive advantage [65,66].This indicates that C. tamala H-D relationship at our study site might have been mediated by shade tolerance or crown dominancy.
Studies have shown that shade tolerance and climate jointly effect on allometric variation [66][67][68].However, assessing the possible variation in C. tamala H-D relationship is beyond the scope of this study due to the small size of our dataset.H-D relationship models are often developed based on sampling data from permanent sample plots that are monitored for a relatively extended period [59,64,69,70].Studies that employed longitudinal data along with other plot-level covariates in nonlinear mixed H-D relationship models show improved model performance compared to ordinary nonlinear least square models [18,52,59,71,72].Although the inclusion of stand variables may improve the prediction power of the model, this often requires greater sampling efforts especially when data collection involves an International Journal of Forestry Research invasive sampling approach.Our study did not include such covariates because of the cross-sectional nature of data that were collected from both inside and outside of sample plots.Future researchers can test the validity of our model and improve the predictive performance utilizing stand and plotlevel covariates when available.Predictive performance of the models beyond the observed range was not possible as the fitting dataset has larger sizes and taller trees.Independent datasets from different geographical locations and management systems with larger and taller trees than those used in this study could confirm the predictive power of the model [14].Motallebi and Kangur [73], using the annual height and diameter increment data of 114 trees (observation periods spanned from 40 to 115 years) from three European countries, indicated a variable allometric relationship (significant coefficients across sites) between tree height and diameter which did not follow elasticity or geometric scaling rules across sites.H-D relationship varies among ecoregions due to differences in bioclimatic conditions [48,69,74] and among management regimes [32].Therefore, H-D relationship of species should be evaluated across climatic and management  systems to understand the changes in the stem allometry of species.As C. tamala prefers moisture rich areas, water stress could change the H-D relationship.Therefore, the inclusion of water availability in original H-D relationship models and a mixed-effect modeling approach could further optimize the model's performance.Since this study's fitting and validation data came from same climatic and management conditions, validation of these models with independent data from different geographic, climatic, and management practices could be a potential topic for future research.

Conclusion
We calibrated and tested 18 nonlinear models on Cinnamomum tamala trees grown under the natural multilayer tree structure in mid-hills of Nepal.The best model was selected based on multiple performance criteria, error diagnostics, growth theory, and practical application.Results showed that the Gompertz model (M14) best fitted the fitting dataset and performed consistently on the validation dataset and can be used reliably in biomass estimation and management for C. tamala trees in the mid-hills of Nepal.
International Journal of Forestry Research The existing studies on C. tamala height-diameter (H-D) modeling are limited to even-aged stands or plantation and have not attempted to develop H-D relationship models under natural forests.The methodology presented in this paper utilizes ordinary nonlinear models to generate H-D relationship in the uneven-aged natural forest of C. tamala.Since we considered diameter as the only predictor variable, caution should be applied while using our model to other forests stands that differ in site and stand conditions to the basis of this study.In future research, our model can be extended to other regions and management regimes (e.g., the tree outside forestry), following updates through refitting and validation against independent data from the broadest possible ranges of size, site, and stand conditions, including stand and plot attributes across the distributional ranges of C. tamala.

Figure 1 :
Figure 1: Residuals analysis plots of 15 H-D relationship models fitted for Cinnamomum tamala.

Figure 2 :
Figure 2: Scatter plot of total height (H) against D superimposed with M14 for fitting data (a) and validation data (c); a histogram of residuals of M14 for fitting data (b) and validation dataset (d).
[27]l, which extends from 28 ∘ 13  57  to 28 ∘ 20  57  N latitude and 84 ∘ 08  53  to 84 ∘ 12  42  E longitude.The forest in the study area is managed by the Sikles unit of Annapurna Conservation Area Project (ACAP) and constitutes C. tamala as a dominant species having contiguous distribution pattern and with an Important Value Index (IVI) of 158.0[27].

Table 1 :
Descriptive statistics of the variables for both fitting and validation dataset.

Table 2 :
Models selected for further evaluation, their designation, and parameters of the models.

Table 3 :
Parameter estimates of the selected models.

Table
Performance statistics of selected models for fitting and validation dataset.

Table 5 :
Rank of selected models based on performance criteria for fitting and validation dataset.

Table
Prediction statistics of the best model (M14) for validation dataset.