Classification of Vegetation to Estimate Forest Fire Danger Using Landsat 8 Images: Case Study

The vegetation cover of the Earth plays an important role in human life, whether it is natural forest or agricultural crop. Studying the variability of the vegetation cover and observing its condition make it possible to forecast, monitor, and estimate forest fire conditions in a timely manner. The objectives of the research were (i) to process the satellite image of the Gilbirinskiy forestry located in the basin of Lake Baikal; (ii) to select homogeneous areas of forest vegetation on the basis of their spectral characteristics; (iii) to estimate the level of forest fire danger of the area by vegetation types. The paper presents an approach for estimation of forest fire danger depending on vegetation type and radiant heat flux influence using geographic information systems (GIS) and remote sensing data. The Environment for Visualizing Images (ENVI) and the Geographic Resources Analysis Support System (GRASS) software were used to process satellite images. The area's forest fire danger estimation and visual presentation of the results were carried out in ArcGIS Desktop software. Information on the vegetation was obtained by analysing the Landsat 8 Operational Land Imager (OLI) images for a typical forestry of the Lake Baikal natural area. The maps (schemes) of the Gilbirinskiy forestry were also used in the present article. The unsupervised k-means classification was used. Principal component analysis (PCA) was applied to increase the accuracy of image interpretation. The classification of forest areas according to the level of fire danger caused by the typical ignition source was carried out using the developed method. The final information product was a map displaying a vector polygonal feature class whose attribute table contains the type of vegetation and the level of fire danger for each forest quarter. The fire danger estimation method developed by the authors was applied to each separate quarter and showed realistic results. The method used may be applicable for other areas with prevailing forest vegetation.


Introduction
Forest fires pose a global threat to natural systems and humans. Every year the number of forest fires in the world is growing. The problem of forest fire danger estimation in the Lake Baikal basin is of particular relevance [1][2][3]. The anthropogenic impact on the forest is constantly increasing. At the same time, the number of natural disasters is increasing. The forest fire danger classes of the territory were determined in the process of the basic forest management and are based on forest types, tree species groups, age of forest stands, and proximity to local infrastructure (roads, settlements, and industrial objects). The use of satellite imagery data to determine or specify the probability of forest fires is one of the best possible solutions. Satellite remote sensing has become the main source of data for predicting fire risk, mapping forest fuel, detecting and monitoring forest fires, and estimating the damage done to vegetation after a forest fire [4][5][6][7][8][9][10].
The ability to determine the place where a fire is likely to occur is a prerequisite for fire management. The vegetation cover is the most important factor in forest fires, because it reflects the presence of forest fuels [11]. The use of remote sensing data in the classification and mapping of vegetation is becoming the main method of fuel assessment. Existing methodologies for determining the type of vegetation include the traditional classification with and without training [12,13], the method based on phenology [14], and the object classification [15].
The vegetation type maps are necessary for spatial calculation of fire danger and for estimation of fire risk by using them in models that simulate growth and intensity of a fire in a landscape. Forest fuel maps are used in various widespread systems for forest fire danger forecasting. For example, such maps are used in the North American systems National Fire Danger Rating System (NFDRS), Fire behavior (BEHAVE), Fire Area Simulator (FARSITE), and National Fire Management Analysis System (NFMAS) [17][18][19][20]. The McArthur Forest Fire Danger Rating System and the McArthur Grassland Fire Danger Rating System [21] are widely used in Australia. The Canadian Forest Fire Danger Rating System, which consists of two main subsystems, the Fire Weather Index (FWI) and the Fire Behavior Prediction System, is used in Canada [20,22]. The Forest Fire Satellite Monitoring Information System of the Russian Federal Forestry Agency (SMIS-Rosleshoz) used in the Russian Federation is based on the Nesterov index. Ground-based observations, aerial photography, modelling, and remote sensing data [20][23][24][25][26][27] are used for mapping. In general, data on the greenness of vegetation, meteorological data, data on surface wetting, and the moisture content of forest fuel are necessary to forecast and to monitor forest fires [28].
In addition, numerous studies showed the feasibility of using geographic information systems (GIS) and remote sensing data [29,30]. The GIS is a widely used tool for processing spatial data and displaying results. A model of a potential forest fire using satellite data and the GIS was developed in China to identify areas with a high probability of forest fire [31]. In Sheriza et al. [32] five fire danger classes were identified in the developed forest fuel maps and the degree of fire danger in peat-bog forests was assessed. The vegetation types in the study area were analyzed using digital classification systems, namely, two vegetation indices: the advanced vegetation index (AVI) and the Tasseled Cap (TC) transformation.
The objectives of the research were (i) to process the satellite image of the Gilbirinskiy forestry located in the basin of Lake Baikal; (ii) to select homogeneous areas of forest vegetation on the basis of their spectral characteristics; (iii) to estimate the level of forest fire danger of the area by vegetation types.
The paper presents an approach for estimation of forest fire danger depending on vegetation type and radiant heat flux influence using GIS and remote sensing data. The ENVI and the GRASS software were used to process the satellite images. The area's forest fire danger estimation and visual presentation of the results were carried out in the ArcGIS Desktop software. The final information product was a map displaying a vector polygonal feature class whose attribute table contains the type of vegetation and the level of fire danger for each forest quarter.
The article is structured as follows: Section 2 contains descriptions of methods; Section 3 describes the results and contains discussion; Section 4 contains conclusions.

Study Area and Data Resources.
The study was conducted in the Gilbirinskiy forestry (Ivolginskiy district of the Republic of Buryatia) in a protected natural area. The study area is located between latitudes 51°35′50″ N and 51°46′40″ N and longitudes 106°42′00″ E and 107°02′00″ E [33]. This area is 55 km southeast of Lake Baikal and belongs to the northern tip of the Selenga middle mountains (Figure 1). The area of the forestry is about 270 km² [34]. The forestry is divided into 126 forest quarters. The forest quarter is the main accounting unit of the forest fund in the Russian Federation and it has permanent boundaries.
The region of the study, like the whole of Buryatia, has a sharply continental climate, with cold winters and hot summers. The average temperature in summer is about +18.5 °C and in winter is about -22 °C, and the average annual temperature is about -1.6 °C. The average annual rainfall is 244 mm. A significant feature of the climate is the long duration of sunshine, 1900-2200 hours [35]. Natural plantations occupy about 90% of the territory. The rest of the territory is filled with glades, slopes, pebbles, forest cultures, arable land, pastures, etc. The main forest-forming species are larch, pine, cedar, birch, and aspen [36].
The land use map of the study area was created on the basis of the Landsat 8/OLI medium spatial resolution image obtained on August 21, 2015 (path/rows 7891 × 7791) [37]. The image had 11 bands with different wavelength ranges. Six spectral bands were used for analysis (Table 1). Since this image was taken under cloudless conditions, it was an excellent source for classifying the types of vegetation of the Gilbirinskiy forestry and the surrounding area. The map of forest quarters was used as background data.

Image Processing Technique.
The radiometric calibration was the first important step in Landsat 8 data processing. The Landsat 8 data set consists of normalized values called digital numbers (DN) that represent multispectral image data. These numbers have no physical meaning. They may also be incomparable between independent data sets and even between different bands of the same set. For digital processing of a Landsat 8 image, DN values are usually converted into one of two physical parameters, reflectance or spectral radiance [38].
In this work, DN pixel values were converted into Top of Atmosphere (TOA) reflectance for data processing. The conversion was based on the coefficients presented in the metadata file (MTL.txt) that comes with the image set [39]. It was carried out using the radiometric calibration tool i.landsat.toar of the GRASS software [40,41].
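The arithmetic behind such a DN-to-reflectance conversion can be illustrated with a minimal sketch. It is not the implementation of i.landsat.toar; it only applies the standard Landsat 8 rescaling (multiplicative and additive reflectance coefficients plus a sun elevation correction). The coefficient values and the sun elevation below are illustrative placeholders, not values read from this scene's MTL file.

```python
import math


def dn_to_toa_reflectance(dn, mult, add, sun_elevation_deg):
    """Convert one Landsat 8 OLI digital number to sun-angle-corrected TOA reflectance.

    mult and add correspond to REFLECTANCE_MULT_BAND_n and REFLECTANCE_ADD_BAND_n,
    sun_elevation_deg to SUN_ELEVATION in the MTL metadata file.
    """
    if dn == 0:
        return None  # 0 is the fill (no data) value in Level-1 products
    uncorrected = mult * dn + add
    return uncorrected / math.sin(math.radians(sun_elevation_deg))


# Placeholder values for illustration only (assumptions, not taken from the paper).
EXAMPLE_MULT = 2.0e-5
EXAMPLE_ADD = -0.1
EXAMPLE_SUN_ELEVATION = 48.0

example = dn_to_toa_reflectance(10000, EXAMPLE_MULT, EXAMPLE_ADD, EXAMPLE_SUN_ELEVATION)
print(f"TOA reflectance: {example:.4f}")
```

In practice the same function would be applied to every pixel of every band, each band with its own coefficients.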
The atmospheric correction was the next step after the radiometric calibration. The data on the underlying surface and the objects upon it obtained by the sensor system were distorted by many factors, the main one being the atmosphere [42].
There are various algorithms to perform the atmospheric correction, such as standard absolute correction, standard relative correction, and corrections based on specified models [43]. This research focused on the method that used the atmospheric model [44][45][46][47]. The main atmospheric effects taken into account included absorption by water vapour, oxygen, ozone, and carbon dioxide and scattering by aerosols and molecules. The input parameters of this model were the geometry of the sun and sensor location, the atmospheric model for gaseous components, the aerosol model (type and concentration), the optical thickness of the atmosphere, the surface reflection coefficient, and the spectral bands. The parameters for the model were taken from the metadata file supplied with the raster image. The acquisition conditions of each image were taken into account.
Popular models for the atmospheric correction are the Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes (FLAASH), implemented in the ENVI software, and the Second Simulation of a Satellite Signal in the Solar Spectrum (6S) (Figure 2). The 6S model is available in the GRASS software [40]. It is also available on a free-access online platform that allows the model to be configured and downloaded as a file [48].
The atmospheric correction was performed with the i.atcorr tool available in the GRASS software [49]. The tool input data consisted of the configuration file of a specific band and the image of this band. Each spectral band was processed separately. The central coordinates of the image were latitude 51°41′34.94″ N, longitude 107°05′35.41″ E. The exposure time as well as the location was indicated on the source site [50] and in the metadata. The time was given on the source site as a set of numbers "2015: 233: 03: 45: 26.9418490" in the format "year: day in year: hours: minutes: decimal seconds", while the time alone was given in the SCENE CENTER TIME column of the metadata file.
Since the image was acquired in August, the midlatitude summer option was chosen for the "atmospheric model" parameter. The aerosol model depends on the latitude of the study area. In this case the continental model was used. The average height above sea level was obtained from the Shuttle Radar Topography Mission (SRTM) file [51] for the study area using the r.univar module of the GRASS software. Values of parameters for atmospheric and aerosol models were selected from standard options. Since an optical visibility index obtained on the ground was not available for the area, the value for the model of aerosol concentration was set to 0. The optical thickness parameter was taken from the site [52]. The sensor was on board the satellite, so the code indicating the position of the sensor was -1000. The last parameter was the number of the spectral band selected for processing; for Landsat 8/OLI it was set as follows: B2, 116; B3, 117; B4, 118; B5, 120; B6, 122; B7, 123.
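The per-band configuration file described above can be assembled programmatically. The following sketch builds the text of one such file from the values stated in the text (date, scene-centre time and coordinates, midlatitude summer atmosphere, continental aerosols, visibility 0, sensor code -1000, band codes). The line order and the numeric option codes (18 for Landsat 8 geometry, 2 for midlatitude summer, 1 for continental aerosols) follow the i.atcorr parameter file layout as understood here and should be checked against the GRASS manual. The mean elevation and the aerosol optical depth are not given in the text, so the values used in the example call are hypothetical placeholders.

```python
def dms_to_decimal(degrees, minutes, seconds):
    """Convert degrees, minutes, seconds to decimal degrees."""
    return degrees + minutes / 60.0 + seconds / 3600.0


def decimal_hours(hours, minutes, seconds):
    """Convert a clock time to decimal hours (GMT), as the parameter file expects."""
    return hours + minutes / 60.0 + seconds / 3600.0


# Band codes for Landsat 8/OLI as listed in the text.
BAND_CODES = {"B2": 116, "B3": 117, "B4": 118, "B5": 120, "B6": 122, "B7": 123}


def atcorr_parameters(band, elevation_km, aod_550):
    """Return the parameter file text for one band.

    elevation_km and aod_550 must be supplied by the user
    (from SRTM statistics and an optical thickness source, respectively).
    """
    lat = dms_to_decimal(51, 41, 34.94)
    lon = dms_to_decimal(107, 5, 35.41)
    gmt = decimal_hours(3, 45, 26.94)
    lines = [
        "18",                                    # geometrical conditions: Landsat 8 (assumed code)
        f"8 21 {gmt:.2f} {lon:.3f} {lat:.3f}",   # month day hh.ddd longitude latitude
        "2",                                     # atmospheric model: midlatitude summer
        "1",                                     # aerosol model: continental
        "0",                                     # visibility 0: optical depth follows
        f"{aod_550}",                            # aerosol optical depth at 550 nm
        f"{-abs(elevation_km):.3f}",             # mean target elevation in km, entered as negative
        "-1000",                                 # sensor on board a satellite
        str(BAND_CODES[band]),                   # spectral band code
    ]
    return "\n".join(lines) + "\n"


# Hypothetical elevation (0.8 km) and optical depth (0.1), for illustration only.
print(atcorr_parameters("B2", elevation_km=0.8, aod_550=0.1))
```

Calling the function once per key of BAND_CODES yields the six files needed, since each band is processed separately.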

Image Classification.
The image was classified after the preprocessing stage. Classification is the process of sorting image elements (pixels) into a finite number of classes based on the values of their attributes. If a pixel satisfies a certain classification condition, it belongs to the class that corresponds to this condition.
During the classification, information and spectral classes were distinguished. Information classes are the objects that need to be recognized in the image: various types of vegetation, water surface, land types, etc. A spectral class is a group of pixels that have approximately the same brightness in a certain spectral range. The task of classification is to match the spectral classes with the information ones [53,54].
There are two main approaches to classification, supervised (with training) and unsupervised (without training). Classification with training is the process by which the brightness value of each pixel is compared with reference samples. Each pixel is assigned to the most appropriate class of objects. Classification without training is a process characterized by the automatic distribution of image pixels. This method is based on an analysis of the statistical brightness distribution of pixels. A combination of different classification methods is also used, which gives optimal results, especially for large data sets [55].
In this work, the unsupervised classification method was used. The two most common classification algorithms without training were used (namely, k-means and ISODATA [3]). The Iterative Self-Organizing Data Analysis Technique (ISODATA) is an unsupervised classification method based on cluster analysis. One class includes pixels whose brightness values are closest in the space of spectral features [55,56]. The idea of the k-means algorithm is to minimize the distances between objects in clusters based on their distance to the centre of mass [55,56]. When satellite images are classified, such objects are pixels with their values in n-dimensional space (n is the number of spectral bands). Two parameters are required to start the algorithm: the number of classes (clusters) and the condition to stop the algorithm. The number of iterations (repetitions of the procedure that distributes objects into classes) or the change in the barycenter position can be used as the condition to stop the algorithm (Figure 3).
The unsupervised classification of image pixels by the k-means method was used in accordance with the reflectance of each of them in the spectrum ranges. All clustering operations were performed in the ENVI software.
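To make the two stopping conditions concrete, here is a minimal pure-Python sketch of k-means on pixel vectors. It is not the ENVI implementation: the initial centres are simply taken at even intervals through the input, and the shift threshold is an absolute distance in reflectance space, which is an assumption made for this illustration.

```python
import math


def kmeans(points, k, max_iter=25, shift_threshold=0.05):
    """Cluster points (tuples of band values) into k classes.

    Stops after max_iter iterations or when no centre moves
    farther than shift_threshold between two iterations.
    """
    # Deterministic start: k points spread evenly through the input list.
    centres = [points[int(i * len(points) / k)] for i in range(k)]

    def nearest(point):
        return min(range(k), key=lambda c: math.dist(point, centres[c]))

    for _ in range(max_iter):
        labels = [nearest(p) for p in points]
        new_centres = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centres.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centres.append(centres[c])  # keep the centre of an empty cluster
        shift = max(math.dist(a, b) for a, b in zip(centres, new_centres))
        centres = new_centres
        if shift <= shift_threshold:
            break

    labels = [nearest(p) for p in points]  # final assignment to the final centres
    return labels, centres


# Two well-separated groups of hypothetical two-band pixel values.
pixels = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (1.1, 1.0), (1.0, 1.1)]
labels, centres = kmeans(pixels, k=2)
print(labels)
```

For a real image, each tuple would hold the six band reflectances of one pixel, so the clustering runs in six-dimensional space.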
Two types of territories (areas with and without vegetation cover) were identified after the initial classification of the image. Reclassification was then carried out using a mask created in the ENVI software from the territory with vegetation. Because the level of forest fire danger is determined by the type of vegetation, a new classification was required.
It has been shown that the most informative principal components can be used for this purpose [57]. The number of principal components is equal to the number of channels processed. A colour image of the first three principal components can be created for visualization. Usually, the first two or three components describe almost the entire variability of the spectral characteristics. The remaining components mostly contain noise. By discarding these components (in our case 4, 5, and 6), the amount of data can be reduced without noticeable loss of information.
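The claim that the first components carry almost all of the variability can be checked numerically. The sketch below estimates the share of variance explained by the leading principal components using a covariance matrix, power iteration, and deflation. This is one simple way to obtain the leading eigenvalues, chosen here to keep the example dependency-free, and is not the procedure used in ENVI. The sample values are hypothetical.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix of rows (one row per pixel, one column per band)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
            for i in range(d)]


def leading_eigenpair(matrix, iterations=500):
    """Largest eigenvalue and its eigenvector of a symmetric matrix (power iteration)."""
    d = len(matrix)
    vec = [1.0 + 0.1 * i for i in range(d)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            return 0.0, vec
        vec = [v / norm for v in nxt]
    value = sum(vec[i] * sum(matrix[i][j] * vec[j] for j in range(d)) for i in range(d))
    return value, vec


def explained_variance(rows, n_components):
    """Fraction of total variance carried by each of the first n_components."""
    cov = covariance_matrix(rows)
    d = len(cov)
    total = sum(cov[i][i] for i in range(d))
    shares = []
    for _ in range(n_components):
        value, vec = leading_eigenpair(cov)
        shares.append(value / total)
        # Deflation: remove the component just found before searching for the next one.
        cov = [[cov[i][j] - value * vec[i] * vec[j] for j in range(d)] for i in range(d)]
    return shares


# Hypothetical three-band samples: bands 1 and 2 are strongly correlated, band 3 is near-constant.
samples = [(1.0, 2.0, 0.0), (2.0, 4.0, 0.1), (3.0, 6.0, 0.0), (4.0, 8.0, 0.1)]
print(explained_variance(samples, 2))
```

When the first shares sum close to 1, the remaining components can be dropped with little loss, which is the reasoning applied to components 4, 5, and 6 above.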

Methods to Assess Forest Fire Danger Caused by Vegetation Types.
Previously, it was proposed to estimate the forest fire danger using a reference ignition source [16]. Focused solar radiation was used as the source in that paper. However, the method can be generalized to the case of forest fuel exposure to a radiant heat flux, including heat flux from the core of the tree trunk to the surface during lightning activity, heat flux from a heated mass or particles during anthropogenic load, and heat flux from the fire front during the propagation of burning over the forested territory or the occurrence of massive forest fires. The forest fire danger from exposure to radiant heat flux can be estimated by analysing the vegetation type over a particular forest area. This allows classifying the forest according to the level of fire danger of forest fuel.
It is necessary to describe the procedure for the gradual elimination of small areas that do not represent a significant fire danger. Initially, sites that belong to the road network of the quarter are excluded. As there are no forest fuels in these sites, ignition is impossible in them. Other areas of the forested area are considered subject to fire danger. The next stage is the elimination of areas with water bodies and water-saturated marshes, since either there are no forest fuels or the moisture content is high. Then, territories located in the lowlands are excluded from further consideration. Furthermore, sites with deciduous forest are excluded, because the probability of forest fire occurrence is known to be low in deciduous forests. Areas with mixed forest can similarly be ignored. In such forest stands, layers of forest fuel are formed mainly from leaves in the upper layer, and needles are separated from each other and at some distance from the surface. Individual needles do not burn with open flame; they only smoulder. Afterwards, relatively young coniferous forest stands are regarded as not posing notable fire danger and can be excluded from consideration. After this series of consecutive iterations, a map of the areas with the highest fire danger (old coniferous forest) is obtained.
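The elimination sequence described above can be expressed as an ordered list of filters. The sketch below is only an illustration of that ordering: the Site record, its field names, and the age limit separating young from old coniferous stands are hypothetical, since the text does not give a numeric age threshold.

```python
from dataclasses import dataclass


@dataclass
class Site:
    name: str
    cover: str       # "road", "water", "marsh", "deciduous", "mixed", or "coniferous"
    lowland: bool
    stand_age: int   # years; only meaningful for forest cover


YOUNG_STAND_AGE = 40  # hypothetical limit, not taken from the paper

# Steps in the order given in the text: each entry is (label, predicate for exclusion).
ELIMINATION_STEPS = [
    ("road network", lambda s: s.cover == "road"),
    ("water bodies and marshes", lambda s: s.cover in {"water", "marsh"}),
    ("lowlands", lambda s: s.lowland),
    ("deciduous forest", lambda s: s.cover == "deciduous"),
    ("mixed forest", lambda s: s.cover == "mixed"),
    ("young coniferous stands",
     lambda s: s.cover == "coniferous" and s.stand_age < YOUNG_STAND_AGE),
]


def eliminate(sites):
    """Apply the steps in order; return the remaining sites and a per-step log."""
    remaining = list(sites)
    log = []
    for label, should_drop in ELIMINATION_STEPS:
        kept = [s for s in remaining if not should_drop(s)]
        log.append((label, len(remaining) - len(kept)))
        remaining = kept
    return remaining, log


example_sites = [
    Site("a", "road", False, 0),
    Site("b", "water", False, 0),
    Site("c", "coniferous", True, 90),
    Site("d", "deciduous", False, 60),
    Site("e", "mixed", False, 70),
    Site("f", "coniferous", False, 20),
    Site("g", "coniferous", False, 95),
]
fire_sites, step_log = eliminate(example_sites)
print([s.name for s in fire_sites], step_log)
```

The sites that survive all steps correspond to the old coniferous stands treated as the highest fire danger areas.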
The probability of forest fire occurrence in vegetation conditions when forest fuel is exposed to radiant heat flux was determined as follows for a particular forest quarter:

$$P \approx \frac{N_{FS}}{N_{TS}} \approx \frac{S_{FS}}{S_{TS}},$$

where $P$ is the probability of forest fire occurrence caused by forest conditions, $N_{FS}$ (FS, fire sites) is the number of fire danger sites in the quarter, $N_{TS}$ (TS, total number of sites) is the total number of sites in the quarter, $S_{FS}$ is the area occupied by coniferous vegetation in the quarter, and $S_{TS}$ is the total area of the forest quarter. The level of forest fire danger was determined by the calculated value of the probability of a fire caused by the influence of the radiant flux. A total of five categories were used to determine the level of forest fire danger, from emergency to low. Elimination of forest areas with low fire danger, assessment of the probability of forest fires caused by the radiant heat flux, and determination of the forest fire danger level were implemented in the ArcGIS Desktop software [58] using the forest taxation data and the vegetation type. The data on the composition of vegetation are either terrestrial taxation descriptions or the results of satellite image processing. Operations were performed using the built model, in which the geoprocessing tools were connected to each other in sequence. The script was written in Python on the basis of standard processing tools [59,60]. The algorithm is shown in Figure 5. The result of all operations was a thematic map showing the fire danger levels of forest quarters.
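As a rough illustration of the calculation for one quarter, the sketch below takes the probability as the ratio of the fire-danger (coniferous) area to the total area of the quarter and maps it to one of five levels. It does not use ArcGIS and is not the authors' script. The level names follow the list given later in the paper (extreme, high, mean, moderate, low), while the equal-width thresholds are a hypothetical choice, because the text does not state the class boundaries.

```python
def fire_probability(fire_area, total_area):
    """Ratio of the fire-danger area to the total area of a forest quarter."""
    if total_area <= 0:
        raise ValueError("total_area must be positive")
    if not 0 <= fire_area <= total_area:
        raise ValueError("fire_area must lie between 0 and total_area")
    return fire_area / total_area


# Hypothetical equal-width class boundaries (lower bound, level name).
DANGER_LEVELS = [
    (0.8, "extreme"),
    (0.6, "high"),
    (0.4, "mean"),
    (0.2, "moderate"),
    (0.0, "low"),
]


def danger_level(probability):
    """Return the first level whose lower bound the probability reaches."""
    for lower_bound, name in DANGER_LEVELS:
        if probability >= lower_bound:
            return name
    raise ValueError("probability must be non-negative")


# Hypothetical quarter: 150 ha of coniferous forest out of 200 ha.
p = fire_probability(150.0, 200.0)
print(p, danger_level(p))
```

In a GIS workflow the resulting level would be written to an attribute field of each quarter polygon.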
The general scheme has been developed to estimate forest fire danger based on a comprehensive data analysis according to forest conditions using satellite image interpretation (Figure 6).

Results and Discussion
The first stage was the satellite image processing using the free software GRASS GIS. The correction parameters were defined and set, and the radiometric calibration and the atmospheric correction were implemented. The dimensionless DN values characterizing the pixels of the image were converted into reflectance corrected for the local solar zenith angle [39]. Further atmospheric corrections were made for each spectral band. The 6S model was used. Corrected spectral band files were obtained as a result of processing (Figure 7).
A multiband image was formed from the corrected rasters (bands 2 to 7). The image was used for the classification of spectral characteristics into separate groups. The area of interest, which includes the study area, was selected on the multiband raster using the Region of Interest (ROI) tool of the ENVI software, and the raster was cropped to this area (Figure 8).
The results of the classification are shown in Figure 9. The classification parameters (the number of clusters, the number of iterations, and the threshold for changing the position of the centre of mass) were determined on the basis of visual interpretation of the image and the literature data [2,3]. In order to make it easier to separate surfaces with similar reflectance, a sufficiently large number of classes, 15, was chosen. The maximum possible number of iterations was set to 25. The threshold for changing the position of the centre of mass was set to 5%. For all classes in each channel, the average reflectivity values were calculated and spectral signatures were constructed (Table 2; Figure 9). Figure 11 shows the result of an unsupervised classification that includes all classes of the study area. In our case, these are water surfaces, various types of vegetation, and soil. The type of surface for a particular class is determined by spectral responses, i.e., by the shape (extremes) of the spectral signature. For example, water surfaces have lower reflectivity in the classes 4-6 than other surfaces. The characteristic features of the vegetation spectral curves are high values of reflectivity in the near infrared range and lower values in the red and shortwave infrared (SWIR) ranges [61]. The classes 2 to 6 and 8 to 11, with a pronounced peak in the near infrared wavelength range (Band 5), correspond to this criterion. For other classes that have a high reflectivity value in Band 5 (7; 12-15), the value in Band 6 is also quite large (Figure 9; Table 2). Spectral signatures with such characteristic peaks correspond to other types of terrestrial surface. The principal components of the spectral responses were calculated using the created vegetation mask, which increased the information content. Six components were selected (Figure 10). The first two components are fairly representative, as they contain most of the variability of the data. Their values were used as the input in the new classification.
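The signature criterion used above (a pronounced near infrared peak in Band 5, with clearly lower values in the red Band 4 and the SWIR Band 6) can be written as a simple rule on class mean reflectances. The sketch below is an illustration only: the margin factor and the sample signatures are hypothetical and are not values from Table 2.

```python
NIR_MARGIN = 1.5  # hypothetical factor by which NIR must exceed red and SWIR


def looks_like_vegetation(signature, margin=NIR_MARGIN):
    """Check a class signature given as mean reflectance by band.

    Expected keys: "B4" (red), "B5" (near infrared), "B6" (shortwave infrared).
    """
    nir = signature["B5"]
    return nir > margin * signature["B4"] and nir > margin * signature["B6"]


# Hypothetical mean signatures for three classes.
class_signatures = {
    "forest-like": {"B4": 0.04, "B5": 0.30, "B6": 0.12},
    "soil-like": {"B4": 0.15, "B5": 0.25, "B6": 0.30},
    "water-like": {"B4": 0.03, "B5": 0.02, "B6": 0.01},
}

vegetation_classes = [name for name, sig in class_signatures.items()
                      if looks_like_vegetation(sig)]
print(vegetation_classes)
```

Classes passing such a rule would form the vegetation mask, while classes with a high Band 5 value but a comparably high Band 6 value would be left with the other surface types.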
Figure 11 shows the spatial change in the values of the main components in the territory of the Gilbirinskiy forestry and the adjacent territory.
The classification of the first two components of the pixel reflectance values allowed a more detailed separation of the surface with vegetation. This made it possible not only to separate vegetation from other surfaces, but also to differentiate the vegetation by type. The division into classes was based on the values of spectral signatures (Figure 12) and visual interpretation (Figures 7 and 8). During the active process of photosynthesis, vegetation absorbs radiation in the visible spectrum and reflects strongly in the near infrared. For visual interpretation, classical colour compositions from spectral bands 4, 3, 2 (natural colours) and 5, 4, 3 (vegetation red) were used. The colour composition in natural colours allowed visual identification of various types of surfaces, as well as verification of the correctness of the classification. The second colour composition gives an idea of the vegetation type. The brighter the red colour, the more active the process of photosynthesis in the area. Thus, for example, coniferous vegetation has darker shades, and cultivated vegetation is brighter.
The following division into classes was obtained after reclassification: classes 1, 2, and 4, coniferous forest; classes 5 and 7, deciduous forest; class 6, lands partially covered with vegetation; class 3, deadwood; class 8, cultivated vegetation and grass. Then the rest of the lands were classified. The mask was created and the analysis of the principal components was carried out. Then classification was implemented. All actions were similar to the previous ones, so another classified part of the image was obtained.
Reclassification was done to combine the same classes into one in order to obtain the final image. Thus, the following categories of land (Figure 13) were identified on the territory of the Gilbirinskiy forestry as a result of the interpretation of the satellite image: water bodies, plantless ground, partially green soils, and deadwood; cultivated vegetation; deciduous forest; coniferous forest.

Classification of Forests by the Level of Fire Danger.
The interpretation results were converted into a vector view (Figure 14) and combined with taxation units (quarters) using the ArcGIS tools. The areas occupied by various land categories were defined for the territory of the forest stand. The classification of forest areas according to the level of fire danger caused by thermal radiation was carried out (Figure 15). The level of fire danger was determined by the calculated value of the probability of a fire caused by the action of the radiant flux. A total of five categories were used to determine the level of forest fire danger: extreme, high, mean, moderate, and low levels [16]. The possibilities to estimate fire danger from the data on the vegetation and land use were demonstrated using the satellite image. The method was applied to the Gilbirinskiy forestry of the Republic of Buryatia.

Mathematical Problems in Engineering
The paper discusses the method of preprocessing and correction of Landsat 8 satellite images, which can be applied almost without changes to images of the area obtained at another time. Based on the corrected data, classes were identified and matched with the corresponding type of territory by spectral responses. Fire danger estimation was carried out according to the vectorised classified data. The fire danger estimation approach developed by the authors was applied to each separate quarter and showed realistic results. The method used may be applicable for other areas in which forest vegetation prevails.
The obtained land use maps and the fire danger map can serve to prevent fires and their consequences in the territory of the Gilbirinskiy forestry. Decision making based on map data is a tool used in various industries. Given the characteristics and speed of change of forest areas, the composed maps can be expected to remain relevant for a long period of time. This is due to the low rate of change in the area of forest plantations: both the growth of coniferous trees and felling in a specially protected natural area proceed slowly.

Conclusion
Summing up, an approach has been developed to estimate the forest fire danger by vegetation type with a radiant heat flux as the reference source of ignition. It should be noted that the developed method can be used to estimate forest fire danger both from thunderstorm activity and from anthropogenic factors. In the first case, the heat flux from the cloud-to-ground lightning discharge channel, which is in contact with deciduous or coniferous trees, should be estimated. In a number of situations, direct contact of a lightning channel with a layer of ground forest fuel is possible, when the electric discharge current spreads in the ground cover and upper soil layer. In the case of an anthropogenic ignition, situations must be distinguished in which it is necessary to estimate the radiant heat flux from a massive heated body or from a local heating source (particles heated to high temperatures or firebrands). In addition, a situation is possible in which solar radiation focused by glass containers or their fragments acts on a forest fuel layer [62].
In general, the types of vegetation in the Gilbirinskiy forestry were obtained by the computerized classification and the satellite image interpretation using the ENVI, GRASS, and ArcGIS software.The proposed approach allows estimating the localization and composition of vegetation, as well as identifying the areas with highest fire danger.Thus, for the marked areas with a high level of risk, it is recommended to conduct further detailed studies, which should include study of the territory on the ground and creation of reference zones.It is also necessary to develop recommendations for reforestation and minimizing forest fire risks.The technique was used for typical forestry of the Republic of Buryatia and can be applied to a large part of the territory belonging to the natural zone of Lake Baikal.

Figure 2: Scheme for the atmospheric correction (satellite module in the Solar Spectrum).

Figure 3: K-means algorithm applied to classification.

Figure 4: Diagram of a study area for selecting the most fire-prone sites with results from three initial iterations ((a)-(c)) and three final iterations ((d)-(f)) [16].

Figure 5: Model to classify forest according to forest fire danger level.

Figure 6: Forest fire danger assessment scheme based on satellite image interpretation.

Figure 7: Images in natural colours before and after the atmospheric correction.

Figure 9: The result of the unsupervised classification of the Gilbirinskiy forestry and spectral class signatures (y-axis is the reflectance value, average for the class; x-axis is the spectral band).

Figure 13: Categories of land in the forestry and adjacent area.

Figure 14: Areas of different types of vegetation within the Gilbirinskiy forestry.

Figure 15: Map with forest fire danger level.

Table 2: Average values of reflectance by class.