Characterization of Forested Landscapes from Remotely Sensed Data Using Fractals and Spatial Autocorrelation

1 Universities Space Research Association at NASA Marshall Space Flight Center, National Space Science and Technology Center, NASA Global Hydrology and Climate Center, Huntsville, AL 35805, USA 2 Civil and Environmental Engineering Department, University of Alabama in Huntsville, Huntsville, AL 35899, USA 3 Earth Science Office at NASA Marshall Space Flight Center, National Space Science and Technology Center, NASA Global Hydrology and Climate Center, Huntsville, AL 35805, USA


Introduction
The characterization of forested landscapes is frequently required in civil engineering practice.Examples include estimation of quantities for clearing prior to construction projects, environmental impact analysis, and characterization of hydraulic roughness for flood plain studies.Increasingly, professionals are considering the use of remotely sensed data to aid in these estimations.In 2003, the Federal Highway Administration funded a project, conducted jointly by Mississippi State University and the University of Alabama in Huntsville, AL, USA. to evaluate the efficacy of using remotely sensed data in the planning and environmental impact analysis of transportation systems.Since transportation lines (both road and rails) frequently must traverse forested areas, one major focus of the project was the use of remotely sensed data to characterize these landscapes.
The specific transportation needs that directed the project were to characterize the hydraulic roughness of flood plains for bridge design and environmental impact analysis.Recent studies have confirmed the importance of rigid, unsubmerged vegetation in determining flow depths and velocities in shallow flow situations such as commonly encountered in wide flood plains [1][2][3][4].The studies highlight the importance of stand density (ratio of obstructed area to total area) and trunk diameter.Musleh and Cruise [1] showed the relationship between these variables and classical hydraulic roughness parameters such as Manning's n and the Darcy f factor.Accurate land cover and species identification are also important aspects of environmental impact Advances in Civil Engineering analysis, particularly in identification of wetlands under the forest canopy.
Several sources of remotely sensed data are currently available that might be useful for forest characterization purposes.The data can be from satellite or aircraft platforms and can be from either passive or active instruments.A large amount of research has been performed using active remotely sensed data, particularly airborne radar to estimate forest parameters [5][6][7][8][9][10][11].Most passive remotely sensed data are much more easily accessible and cost effective than are active data.A significant amount of research has been performed on forest biomass estimation using passive instruments, particularly radiometric data [12][13][14][15][16][17][18][19][20][21][22][23][24].However, studies that employ passive radiometric data (e.g., Landsat Thematic Mapper (TM), NOAA Advanced Very High-Resolution Radiometer (AVHRR), or the Moderate-Resolution Imaging Spectroradiometer (MODIS)) usually focus on the estimation of indirect measurement of biomass or canopy coverage such as the Leaf Area Index (LAI) or Normalized Difference Vegetation Index (NDVI).Thus, it would be of great benefit if passive radiometer data could be employed to characterize forest structure such as stand density and trunk size.
In this study, some spatial analysis techniques are presented that might be employed with Landsat TM data to analyze forest structure characteristics.The spectral characteristics (7 bands) and spatial resolution (30 m) of TM data make it very suitable for use in the analysis of even moderatesized forested areas.In an earlier paper by Al-Hamdan et al. [25], the authors examined the impact of spatial and spectral resolution of remotely sensed data on the spatial indices that might be used for landscape characterization.That study concluded that Landsat TM might possess the ideal attributes for this purpose.A case study is now presented wherein fractal dimensions, along with a simple spatial correlation technique, were related to stand density parameters of the Oakmulgee National Forest located in the southeastern United States (Alabama).

Fractal Analysis and Spatial Autocorrelation Methods
One of the most essential issues in interpretation and analysis of remotely sensed data is the observation and measurement scale [26].Scale is crucial to the characterization of geospatial data because many environmental processes and patterns are scale-dependent [27].Recently developed spatial analysis approaches from a variety of science disciplines offer the possibility of highly efficient statistical characterization, analysis, and identification of spatial data in remotely sensed images.Spatial autocorrelation and fractal measurement are methods that have been developed to characterize the scaling property of spatial data [28].Although the fractal technique and other textural analyses have been applied extensively [29][30][31][32][33][34][35][36][37][38][39][40][41][42], its use as a spatial technique for characterizing remote sensing images needs to be evaluated more in terms of its utility to help characterize forest growth characteristics such as stand sizes [25,26,43,44], as demonstrated in this study.
2.1.Fractals.Classical geometry cannot provide tools for analysis of the forms of most spatial patterns of nature because they are so irregular and fragmented [44].Fractal geometry was developed as a mathematical basis for characterizing complex natural patterns [45].In classical geometry, a point has an integer topological dimension of zero, a line has one dimension, an area has two dimensions, and a volume has three dimensions [26,44].However, the fractal dimension (FD) is a noninteger value that exceeds the Euclidean topological dimension [44,45].The FD can vary between zero and one, between one and two, or between two and three; for a point pattern, a curve, or a surface, respectively [26,44].As the geometrical complexity of a perfectly flat two-dimensional surface (FD = 2.0) increases so that it begins to fill a volume, the FD values approach 3.0 [44].
The foundation for fractal analysis is self-similarity, which can be defined as a property of a curve or surface where each part is indistinguishable from the whole [26,[44][45][46].In order to define the theoretical FD, the degree of self-similarity is used and expressed as a self-similarity ratio [26,44,45].Thus, the FD of a curve can be defined as [26,44] where N s is the number of similar copies and r f is the scale reduction factor.Measuring the length of the curve using various step sizes by a procedure called the walking-divider method [46] is a common way to estimate the FD value of a curve (e.g., a coastline) [26,44].In the case of irregular curves, the length increases as the measurement interval decreases, and a linear regression can capture such an inverse relationship between total line length and step size [26,44,47]: where L is the line length, S s is the step size, B is the slope of the regression, and C is a constant.For the case of a curve, FD can then be calculated by For a raster-based remotely sensed image (surface), FD can be estimated in a similar approach using a method that is called the isarithm method [44], which was evolved from Goodchild [46], Shelberg et al. [48], and Lam and De Cola [49].With the isarithm method, a mean FD from individual FD values of gray-scale contours is computed [26].For each isarithm brightness value and each step size, the algorithm classifies each pixel below the isarithm value as white and each pixel above this value as black [26,44].Each neighboring pixel along the rows or columns is then compared to determine whether the pairs are both black or both white; if they are not of the same color, then an isarithm lies between the two neighboring pixels.The total number of boundary pixels is used to approximate the length of each isarithm line [26,47].A linear regression is performed using the logarithms of the total length of the boundary and the step size.The regression slope B is used to determine the FD of the isarithm line [26,44], where Equation (4) differs from (3) because as a flat surface grows more complex, FD increases from a value of 2.0 and approaches 3.0 as the surface begins to fill a volume [26].
The final FD of the surface is the average of the FD values for those isarithms having a coefficient of determination "R 2 " greater than or equal to 0.9 [26,44,47,50].
It has been shown in many previous forestry research studies [51][52][53][54][55][56][57][58][59][60] that there is a significant positive relationship between crown width and the Diameter at Breast Height (DBH) for both hardwood and softwood species.Thus, an increase in the stand diameter implies an increase in the crown width and vice versa.As illustrated in Figure 1 [25], if continuous small crown trees are covering two adjacent remotely sensed pixels of a similar area, the result is two homogenous surfaces.Thus, the integration of the brightness levels within each pixel (i.e., pixel value) will be similar in magnitude.If the pixel values do not vary significantly, the result is less complexity in terms of fractals (smaller FD) and more homogeneity in terms of autocorrelation [25].On the other hand, if the pixels are covered with large crown trees, the result is nonhomogenous adjacent pixels leading to more complexity in terms of fractals and less homogeneity in terms of autocorrelation [25].

Spatial Autocorrelation. Another index that can be used to analyze the spatial autocorrelation of images is Moran's I.
Moran's I [61] is an index of spatial autocorrelation which reflects the differing spatial structures of the smooth and rough surfaces [47].Thus, it is a potential technique in characterizing and estimating measures of forest-related surface roughness based on crown diversity.Moran's I is a spatial correlation statistic and is calculated from the following formula [47]: where I(d) is Moran's spatial autocorrelation at distance d, w i j is weight at distance d, so that w i j is 1 if point j is within distance d of point i, otherwise w i j = 0, z i is deviation (i.e., z i = x − x mean for variable x).W = the sum of all the weights where i / = j.Moran's I can vary from +1.0 for perfect positive autocorrelation (a clumped pattern) to −1.0 for perfect negative autocorrelation (a checker board pattern) [44].Moran's I is different from the FD in that the FD is focused on the object shape, size, and the tortuosity of the edges of these objects [47].Moran's I does not explicitly consider the shapes and sizes of objects once the weights, w i j , in (5) are determined [47].

Case Study
Oakmulgee National Forest is located in central Alabama and encompasses an area of 128,638 ha (317,861 acres) (Figure 2).According to the US Forest Service inventory analysis, there are three size classes present within the forest data sets, namely, sawtimber, poletimber, and saplings.This size classification was based on the diameter at breast height (DBH) of the tree trunk.The DBH values for those classes are greater than 22.86 cm (9.0 in), from 12.7 cm (5.0 in) to 22.6 cm (8.9 in), and from 2.54 cm (1.0 in) to 12.45 cm (4.9 in), respectively.Forest species includes softwood and hardwood trees.Longleaf-slash pine, shortleaf-loblolly, and cypress are examples of softwood trees (Figure 3(a)).White oak, red oak, hickory, sweetgum, ash, and yellow-poplar are examples of hardwood trees (Figure 3(b)).The forest database included digital maps linked to attribute tables containing the stand characteristics (i.e., tree types, sizes, and species) of the forest.Other spatial data were also collected to characterize the region for geoidentification and location purposes.These data included state and county maps, roads, streams, and topographic images (digital line graph data) that were useful in providing visual locations of points within the forest area.The sources of these data were the Environmental Systems Research Institute (ESRI) and the US Bureau of the Census Topologically Integrated Geographic Encoding and Referencing System (TIGER).
The remote sensing data consisted of Landsat TM images of the forest area taken in summer 1999.The TM instrument records energy reflected in seven bandwidths as shown in

Image Characterization and Modeling System (ICAMS)
ICAMS is a software module that was developed to measure, characterize, and model multiscale remotely sensed data [62,63].It contains a robust set of fractal measurement algorithms embedded in a GIS-type architecture.It contains a number of spatial measurement methods that can be used to address a variety of significant issues related to scale and fractal analysis.
The main functions of ICAMS are image input, image characterization, specialized functions, and image display and output.The image input subsystem includes basic image processing functions, such as file transformation, georeferencing, image registration, and a variety of image viewing capabilities [62].The image characterization subsystem provides users with an array of spatial and nonspatial measures for characterizing image data.The non-spatial measures include basic descriptive statistics measures (i.e., mean, mode, median, and variance) and histograms.The spatial measures include fractal analysis [49], variogram analysis [64,65], spatial autocorrelation indices [66,67], and textural measures such as local variance [68].The specialized subsystem of ICAMS provides aggregation routines for aggregating pixels to simulate multiscaled data for scale effect analyses [26].The image output subsystem contains functions to output original images or derived products in two-dimensional or three-dimensional form.ICAMS provides the ability to calculate the FD of remotely sensed images using the isarithm method [49], variogram [65], and triangular prism methods [69].Lam et al. [70] found that the isarithm method calculates the FD fairly accurately and more so than the other two methods, thus the isarithm method was used in this study.

Procedure
The procedure followed in the case study is illustrated in Figure 4. Raster and vector data were collected for the Oakmulgee National Forest as described previously.The raster data (TM images) were then imported into the ER Mapper remote sensing software package in order to trim, locate, and crop the images of the study area and to export the data to the ARCINFO 8.2 GIS package.The relevant vector data (state and county lines, roads, streams, DLG (topographic) data, and National Forest delineations) were also imported into ARCINFO and overlaid onto the raster images.Figure 5 illustrates some attributes of the data set.Samples were collected randomly covering of all parts of the forest.As shown in Figures 2 and 5, the area is bisected by the Cahaba River, so care was taken to obtain a representative number of samples from both sides of the river.In addition, the elevation of the forest ranged from 60 m to 170 m above mean sea level with an average of 130 m.So, again care was taken to obtain samples representative of all topographic conditions so as not to bias the results.Criteria for the selection of the window (number of pixels in a sample) were based on the resolution, minimum mapping unit size, and nature of the classes (size of the regions, characteristic scale, directionality, and spatial periodicity) to be identified [71].It is understood that a smaller window size does not convey sufficient spatial or texture information to characterize land surfaces.On the other hand, if the window size is too large, too much information from other surfaces could be included and hence the algorithm might not be efficient.The sample size for this study was chosen to be 100 × 100 pixels based on the average sample size used in the research literature [28,71].As shown in Figure 5(e), a total of 52 samples were collected for Oakmulgee National Forest.It can be noted that the sample size in the study area was larger than 30 and hence the distribution of the sample means can be approximated reasonably well by a normal distribution for statistical purposes [72].
Information was then collected for each sample including size class (sawtimber, poletimber, saplings), species groups (hardwood and softwood), age, and elevation using the national forests vector GIS data obtained from the Forest Service and the digital elevations GIS data obtained from the Earth Resources Observation Systems (EROS) Data Center.
For each sample, the percentage of each size class and species group was determined using the digital map of the national forest based on the area covered by each size class and species group in the GIS.Also, the average elevation for each sample was determined using the Digital Line Graph (DLG) data in the GIS.Table 2 shows the in situ forest data and the average elevations for all samples collected from all the study area.The raster (TM) data of the area were then imported to the GIS module, ICAMS, where the spatial analytical techniques of fractals and autocorrelation were conducted and the spatial indices of FD and Moran's I were found.Figure 6 shows an example of two subset images from TM Band 1 and their associated FD and Moran's I.The FD and Moran's I values were found for all bands of Landsat TM and for all 52 samples.The averages of FD and Moran's I for each sample were then calculated using the results of all Landsat TM bands except the thermal Band 6.Finally, the results were analyzed in terms of the relationships between the image complexity indices and the forest characteristics.

Analysis of Variance (ANOVA) Tests.
First, the ability of the spatial complexity indices to distinguish between the different size classes and species groups was examined.ANOVA is a technique that is useful to test for the equality of several means simultaneously.It is a method for splitting the total variation of the data into meaningful components   that measure different sources of variation [73].One-way ANOVA is analysis of variance when there is one independent variable while two-way ANOVA is analysis of variance when there are two independent variables.
To study the effectiveness of remote sensing analytical techniques to distinguish between two size classes (i.e., two independent factors) and also between two species groups, a number of two-way ANOVA tests [73], were conducted using the average spatial analytical indices (i.e., FD, Moran's  I) as the dependent factor and the tree size classes or species groups as the independent factor.Rejecting the null hypothesis that the column effects are equal to zero means rejecting the hypothesis that there is no difference in the percentages of the two size classes or the two species groups.As discussed previously, the average values of FD and Moran's I were computed from the samples using the results of all TM bands except band 6 (thermal infrared).Two-way ANOVA tests were conducted for two of the size classes at a time.The significance level was chosen to be 0.05 in order to be consistent with the research literature in the areas of remote sensing and land characterization [47,51,74,75].The first test rejected the null hypothesis of no difference between the percentages of sawtimber and poletimber classes at the 0.05 significance level using the average FD as the dependent variable.Similarly, ANOVA rejected the null hypothesis of the percentage of trees in the sawtimber and saplings size classes using the average FD.ANOVA also rejected the null hypothesis for the percentage of trees in the poletimber and saplings size classes using the average FD at the 0.05 level.Thus, it appears from these results that there is sufficient fidelity in the FD computed from the TM data that one can distinguish between timber size classes with a reasonable degree of statistical confidence.The same battery of tests was also run using the average Moran's I as the dependent variable with similar results.The two-way ANOVA test results appear to show that Moran's I values computed from TM data can also be employed to distinguish between timber size classes with reasonable statistical confidence.
Two-way ANOVA tests were also conducted using different species (i.e., softwood and hardwood) as the independent factors.The tests rejected the null hypothesis of no difference in percentage of hardwood and softwood trees in the samples using both the FD and Moran's I as dependent variables at the 0.05 significance level.

Regression Analyses.
The FD is plotted against the percentage of forest trees in each class category in Figures 7 through 9. First, the figures demonstrate that the FD has definite boundaries depending on the percentage of trees in the larger (sawtimber) and smaller (saplings) classes.The maximum FD is 2.89 for a sample with a saplings percentage of 0% and sawtimber percentage of 90%.Conversely, the minimum FD was 2.67 for a sample with a sapling percentage of 66% and sawtimber percentage of 34%.The figures also illustrate that the FD remains fairly constant around the value of 2.7 for samples with either greater than 50% saplings or less than 50% sawtimber.These observations are consistent with the reasoning outlined earlier, that is, trees with smaller crown widths will produce more homogeneous canopies (smaller FD) than will trees with larger crown widths.It appears from these results that when the percentage of smaller trees (saplings) within the forest becomes greater than 50%, then the canopy image becomes so homogeneous that the lower fractal bound is reached.Thus, the FD cannot be determinative for sapling percentages beyond 50%.However, in the region between the boundaries, there appears to be a fairly linear relationship between the FD and the percentage of class category.Samples with greater Advances in Civil Engineering than 50% saplings or less than 50% sawtimber were excluded from the data set (a total of 10 samples) and linear functions were fitted to the remaining data as shown in Figures 10-12.Both Figures 8 and 11 show that there is considerable more scatter in the poletimber data than is present in the other two classifications.This observation is also consistent with the general concept discussed previously in that the poletimber represents a forest structure that is dominated by neither small trees nor large trees and thus these samples would demonstrate a greater range of image complexity (and greater variation in FD) than would the sawtimber or saplings samples.
The situation is similar in the case of Moran's I statistic (Figures 13-15).Again, the statistic appears to demonstrate definite bounds with a maximum of about 0.9 for samples with greater than 50% saplings and a minimum of about 0.6 for a sawtimber percentage of 94%.Again, as in the fractals case, there appears to be a fairly linear relationship between Moran's I and the class percentages between the boundaries (Figures 16-18).However, as in the case of fractals, Moran's I cannot be considered a definitive measure of stand density for samples with greater than 50% saplings.
As for the species groups, there appears to be a fairly linear relationship between the FD and the species group percentages as well as between the Moran's I and the species group percentages (Figures 19-22).All the regressions showed an increase (positive slopes) in FD and a decrease (negative slopes) in Moran's I as the hardwood percentages increased while all the regressions showed a decrease (negative slopes) in FD and an increase (positive slopes) in Moran's I as the softwood percentages increased.The explanation is the same as given for the stand sizes case because softwood trees (e.g., Pine trees) are mostly with small crowns, while hardwood trees (e.g., Oak trees) likely have large crowns.As a matter of fact, the species group case had even stronger correlations with the average spatial indices than the DBH case.This can be due to the fact that remote sensing data do not measure DBH directly, but they measure crown reflectivity by satellite sensors.So, for a given species group, the reflectance value recorded by satellite sensors is a function of exposed projection area (canopy closure).The strong relationship between the spatial indices and both types of species groups hardwood and softwood is very important to the potential prediction process of the trunk diameter from the spatial indices, because it suggests that whether the tree is a softwood or a hardwood would not affect the relationship between its trunk size and the spatial indices.In other words, having different species groups would not skew the potential trunk size predictions.Thus, potential forest tree trunk size prediction models would be valid for any species groups.

Conclusions
The results of the case study presented herein appear to confirm the basic theory pertaining to the application of fractal and spatial analyses to forest canopies.The relationships between image complexity and forest stand characteristics appear to be confirmed up to a point.However, it appears that as the percentage of smaller diameter trees becomes greater, and particularly if it exceeds 50%, then the canopy image obtained from Landsat TM data becomes sufficiently homogeneous so that the spatial indices reach their lower limits and thus are no longer determinative.
It also appears, at least for the Oakmulgee forest, that the relationships between the spatial indices and forest class percentages within the boundaries can reasonably be considered linear.The linear relationship is much more pronounced in the sawtimber and saplings cases than in samples dominated by medium sized trees (poletimber).The large variation in the poletimber data is consistent with the theory of image complexity of forest surfaces as put forth in this work.From a civil engineering applications perspective, the results indicate that up to a point where smaller trees dominate the landscape, forested areas can be differentiated according to breast sizes and thus important flood plain characteristics such as ratio of obstructed area to total area can be estimated from remotely sensed data.This would facilitate the estimation of hydraulic roughness coefficients (e.g., Manning's n) for computation of flood profiles needed for bridge design.Even in the case where smaller trunks do dominate, then the knowledge of that fact alone would also facilitate the estimation of the required parameters.Likewise, remotely sensed indices such as fractals or Moran's I can distinguish hardwood and softwood species with good accuracy.Thus, forested landscapes can be mapped for preliminary environmental impact analysis.
These results appear to indicate that both fractal dimensions and spatial autocorrelation indices hold promise as means of estimating forest stand characteristics from remotely sensed images.However, additional work is needed to confirm that the boundaries identified for Oakmulgee National Forest and the linear nature of the relationship between image complexity indices and forest characteristics are generally evident in other forests.In addition, the effects of other parameters such as topographic relief and image distortion due to sun angle and cloud cover, for example, need to be examined.

Figure 1 :
Figure 1: Size class effect on remotely sensed data: (a) small crown trees and (b) large crown trees (courtesy of Al-Hamdan et al. [25]).

Figure 2 :
Figure 2: Location map of study area.

Figure 5 :
Figure 5: Overlaying and sampling process for Oakmulgee National Forest: (a) Landsat TM image, (b) counties, roads, and city locations (c) DLGs, (d) forest stands, and (e) overall layers and samples.

Figure 6 :
Figure 6: Two subset images from TM Band 1 and their associated FD and Moran's I.

Table 1 .
As shown in the table, the spatial resolution of the Landsat TM images is 30 meters except for band 6 which is 120 meters.For consistency purposes, the data recorded in Band 6 were excluded from these analyses.
[26,43,62]dy sites; that is, all samples would have similar atmospheric effects.Additionally, previous work employing spatial statistical techniques in the Image Characterization and Modeling System (ICAMS), particularly fractal analysis of remote sensing data, has implied that there is little influence of atmospheric effects on the overall statistical analysis results[26,43,62].

Table 2 :
In situ forest data of Oakmulgee National Forest, AL, USA.