Analysis of Forest Fires by means of Pseudo Phase Plane and Multidimensional Scaling Methods

Forest fires dynamics is often characterized by the absence of a characteristic length-scale, long range correlations in space and time, and long memory, which are features also associated with fractional order systems. In this paper a public domain forest fires catalogue, containing information on events in Portugal and covering the period from 1980 up to 2012, is analysed. The events are modelled as time series of Dirac impulses with amplitude proportional to the burnt area. The time series are viewed as the system output and are interpreted as a manifestation of the system dynamics. In the first phase we use the pseudo phase plane (PPP) technique to describe forest fires dynamics. In the second phase we use multidimensional scaling (MDS) visualization tools. The PPP allows the representation of forest fires dynamics in two-dimensional space, by taking time series representative of the phenomena. The MDS approach generates maps where objects that are perceived to be similar to each other are placed close together, forming clusters. The results are analysed in order to extract relationships among the data and to better understand forest fires behaviour.


Introduction
Forest fires, whether caused by natural factors, human negligence, or human intent, consume vast areas of vegetation every year. Fire compromises ecosystems, has a direct impact on the economy through the destruction of property and infrastructure, raises carbon dioxide emissions to the atmosphere, affects the water cycle, contributes to soil erosion, and has long-term economic implications associated with climate change. In many regions and countries, like the United States, Australia, Russia, Brazil, China, and the Mediterranean Basin, fire is a major concern nowadays, demanding efficient policies for fire prevention and suppression and for the recovery of the affected areas.
Climate conditions, terrain orography, and type of vegetation are important factors that condition fire propagation and the total burnt area. The efficacy of detection and suppression strategies is fundamental in order to mitigate fire impact. However, fires set by arsonists add to the complexity of the phenomena. Understanding forest fires behaviour and the underlying patterns in terms of fire size and spatiotemporal distributions may help decision makers to take preventive measures beforehand, identifying possible hazards and deciding strategies for fire prevention, detection, and suppression.
In this paper we look at forest fires from the perspective of dynamical systems. A public domain forest fires catalogue containing data of events that occurred in Portugal, in the period from 1980 up to 2012, is analysed. The data is analysed on an annual basis, modelling the occurrences as sequences of Dirac impulses with amplitude proportional to the size of the events. Therefore, we are not modelling the dynamics of each particular forest fire. Instead, we are describing the global fire dynamics along several decades. The time series are viewed as the output of a dynamical system and are interpreted as a manifestation of the system dynamics. In the first phase, we use the pseudo phase plane (PPP) technique. The optimal time delay for the PPP is determined by means of the autocorrelation function. The PPP portraits are compared using an appropriate metric and the results are visualized through phylogenetic trees, generated by hierarchical clustering algorithms. In the second phase, the multidimensional scaling (MDS) tools are adopted to compare and extract relationships among the data.
Having these ideas in mind, the paper is organized as follows. In Section 2 we briefly describe the forest fire catalogue used in this work. In Section 3 we address the problem by means of the PPP and visualization of trees generated by hierarchical clustering algorithms. In Section 4 we use the MDS method. The approach is applied to the data and the main results are interpreted and analysed. Finally, in Section 5, we outline the main conclusions.

Forest Fires Dataset
Data of forest fires collected at the Portuguese Institute of Nature and Forest Conservation (ICNF), available online at http://www.icnf.pt/portal/florestas/dfci/inc/estatisticas, is used [19]. The ICNF dataset contains events from 1980 up to 2012. Ignitions might have different sources, such as natural causes, human negligence, or human intentionality, among others. The data was retrieved in December 2013. Each data record contains information about the event's date, time (with one-minute resolution), geographic location, and size (in terms of burnt area). We discard small events, as those are prone to measurement errors. Moreover, some small events may be missing, probably because they were not reported. For that purpose we adopt a cutoff threshold of 10 hectares for the burnt area. Experiments showed this value to be a good trade-off between catalogue completeness and accuracy of the results.
The evolution of the burnt area and the number of occurrences are depicted in Figures 1 and 2, respectively. In Figure 3 we depict the Lorenz curve relating the cumulative burnt area to the cumulative number of events. The Gini coefficient, given by twice the Gini area (the area between the Lorenz curve and the line of perfect equality), measures the inequality among values of burnt area and is equal to 0.5968. The time series representative of the occurrences is shown in Figure 4, where we can note the yearly periodicity of the events, with the peaks of burnt area occurring in summer. During the period covered by the catalogue, stronger fire activity was observed around the middle of the 2000-2009 decade.
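The relation between the Lorenz curve and the Gini coefficient can be illustrated with a minimal sketch. It assumes the burnt areas are available as a plain list of positive numbers, sorts them in ascending order, and integrates the Lorenz curve with the trapezoidal rule; the function names `lorenz_curve` and `gini` are illustrative and not taken from the paper.

```python
def lorenz_curve(areas):
    """Return the (x, y) points of the Lorenz curve for a list of burnt areas.

    x is the cumulative fraction of events (smallest first),
    y is the cumulative fraction of burnt area.
    """
    ordered = sorted(areas)
    total = sum(ordered)
    n = len(ordered)
    xs, ys = [0.0], [0.0]
    running = 0.0
    for i, area in enumerate(ordered, start=1):
        running += area
        xs.append(i / n)
        ys.append(running / total)
    return xs, ys


def gini(areas):
    """Gini coefficient: twice the area between the equality line and the Lorenz curve."""
    xs, ys = lorenz_curve(areas)
    # Area under the Lorenz curve by the trapezoidal rule.
    under = sum((xs[i] - xs[i - 1]) * (ys[i] + ys[i - 1]) / 2
                for i in range(1, len(xs)))
    # The area under the equality line is 0.5.
    return 2 * (0.5 - under)
```

With identical event sizes the sketch returns 0, and with all the burnt area concentrated in a single event out of n it returns (n - 1)/n.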
Using the Fourier transform (FT), the forest fires data is analysed in the frequency domain. For each annual time series (33 in total) the amplitude spectra are computed and approximated by a power law (PL) function. The PL parameters are interpreted as the signature of the system dynamics. For example, Figure 5 depicts the amplitude spectra for year 2000. In this case, the PL approximation is $|\mathrm{FT}_{2000}| = 2.28 \times 10^{5}\,\omega^{-0.17}$, with $\omega$ denoting frequency, unveiling fractional order characteristics. However, the FT characterizes the global dynamics and may not constitute the best tool to depict the time-varying artifacts present in the response of complex systems. This means that different approaches are needed to better understand forest fires.
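A power law approximation of an amplitude spectrum is commonly obtained by a straight-line fit in log-log coordinates. The sketch below assumes that the frequencies and amplitudes have already been computed by some FT routine and are strictly positive; the name `fit_power_law` is hypothetical, and ordinary least squares is an assumption, since the paper does not state which fitting procedure was used.

```python
import math


def fit_power_law(freqs, amplitudes):
    """Fit amplitude = a * freq**b by least squares on log10-log10 data.

    Returns the pair (a, b). All inputs must be strictly positive.
    """
    lx = [math.log10(f) for f in freqs]
    ly = [math.log10(s) for s in amplitudes]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((x - mean_x) ** 2 for x in lx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    b = sxy / sxx                      # slope of the log-log line (the PL exponent)
    a = 10 ** (mean_y - b * mean_x)    # intercept mapped back to linear scale
    return a, b
```

Applied to a synthetic spectrum generated from the parameters reported for year 2000, the fit should recover a prefactor close to 2.28e5 and an exponent close to -0.17.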

Analysis of Forest Fires by means of PPP
The PPP is a particular case of the pseudo phase space (PPS), which is justified by Takens' embedding theorem [20]. The PPS allows the representation of system dynamics in a higher-dimensional space, by taking a smaller sample of signals representing measurements of the system time history [21][22][23]. The PPS is useful in analysing signals with nonlinear behaviour and systems where complete information about all system states is unavailable. When compared to the classical phase space technique, the PPS reconstruction has the advantage of being more robust to signal noise.
In this section we analyse forest fires on an annual basis, representing the events of the $k$th year ($k = 1980, \ldots, 2012$) by
$$x_k(t) = \sum_{i} A_i\,\delta(t - t_i), \quad 0 \le t \le T,$$
leading to 33 one-year-long time series. This means that the events are modelled as Dirac impulses, where $A_i$ represents fire size, $t_i$ is the instant of occurrence, parameter $t$ represents time, and $T$ is the total time length, in minutes, corresponding to year $k$.
The signals $x_k(t)$ are then normalized according to the following equation:
$$\tilde{x}_k(t) = \frac{x_k(t) - \mu}{\sigma},$$
where $\mu$ and $\sigma$ represent the global mean and standard deviation values, that is, the values calculated for the whole set of events registered during the time period 1980-2012, with minimum magnitude equal to 10 ha.
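A discrete version of the impulse model and its normalization can be sketched as follows. Several choices here are assumptions rather than the paper's procedure: each year is sampled on a uniform grid, an impulse is represented by a single sample holding the burnt area, the population standard deviation is used, and the normalization subtracts the global mean and divides by the global standard deviation sample by sample. The event lists and function names are hypothetical.

```python
import math


def global_stats(events_by_year):
    """Mean and population standard deviation over all event sizes in the catalogue."""
    sizes = [area for events in events_by_year.values() for _, area in events]
    mu = sum(sizes) / len(sizes)
    sigma = math.sqrt(sum((a - mu) ** 2 for a in sizes) / len(sizes))
    return mu, sigma


def impulse_series(events, n_samples):
    """Sampled impulse train: the sample at index t_i holds the area A_i, zero elsewhere."""
    x = [0.0] * n_samples
    for t_index, area in events:
        x[t_index] += area   # simultaneous events add up
    return x


def normalize(x, mu, sigma):
    """Subtract the global mean and divide by the global standard deviation."""
    return [(v - mu) / sigma for v in x]


# Toy catalogue (hypothetical values): year -> list of (sample index, burnt area in ha).
toy = {1980: [(1, 10.0), (3, 30.0)], 1981: [(0, 20.0)]}
mu, sigma = global_stats(toy)
x_1980 = normalize(impulse_series(toy[1980], 5), mu, sigma)
```

Using the same global statistics for every year keeps the 33 normalized series on a common scale, so that years can be compared with one another.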
To implement the PPP, we first integrate the signals $\tilde{x}_k(t)$, obtaining $\tilde{X}_k(t)$. For each case, values of the time delay within the interval $\tau \in [1440, 288000]$ minutes (i.e., $\tau \in [1, 200]$ days) are tested and the optimal time delay, $\tau_k$, is computed, corresponding to the time at which the correlation function has its first point of inflection.
Figure 6 depicts, for example, the signals $\tilde{x}_{2000}(t)$ and $\tilde{X}_{2000}(t)$, representative of the events that occurred in year 2000. The normalized time series, $\tilde{x}_{2000}(t)$, reveals a "noisy" nature, while the corresponding integral, $\tilde{X}_{2000}(t)$, is much smoother, showing more clearly possible correlations between multiple points. The larger discontinuities observed in the amplitude correspond to instants of sudden increase in fire activity.
In Figure 7, the correlation function for year $k = 2000$, $r[\tilde{X}_{2000}(t), \tilde{X}_{2000}(t - \tau)]$, versus the time delay, $\tau$, is presented. In this case, the optimal time delay is $\tau_{2000} = 106$ days. The optimal time delays calculated for the 33 one-year-long time series are summarized in Table 1.
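The delay selection described above (integrate, correlate the signal with its delayed copy, and locate the first inflection of the correlation curve) can be sketched as follows. This is only an illustration under stated assumptions: the Pearson coefficient stands in for the correlation function, integration is a running sum with unit time step, the candidate lags are evenly spaced, and the inflection is detected as the first sign change of the discrete second difference, which is sensitive to noise. All function names are hypothetical.

```python
import math
from itertools import accumulate


def integrate(samples):
    """Running sum of a uniformly sampled signal (unit time step assumed)."""
    return list(accumulate(samples))


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (neither may be constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def lagged_correlation(series, lag):
    """Correlation between X(t) and X(t - lag); lag is a positive number of samples."""
    return pearson(series[lag:], series[:-lag])


def first_inflection(values):
    """Index of the first sign change (or zero) of the discrete second difference."""
    prev_sign = 0
    for i in range(1, len(values) - 1):
        d2 = values[i + 1] - 2 * values[i] + values[i - 1]
        sign = (d2 > 0) - (d2 < 0)
        if prev_sign != 0 and sign != prev_sign:
            return i
        if sign != 0:
            prev_sign = sign
    return None


def optimal_delay(series, lags):
    """Lag at which the correlation curve has its first inflection, or None if none is found."""
    curve = [lagged_correlation(series, lag) for lag in lags]
    idx = first_inflection(curve)
    return None if idx is None else lags[idx]
```

On real data the correlation curve would probably need smoothing before the second difference is taken, since a single noisy sample can otherwise trigger a spurious sign change.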
Figure 8 gives a global perspective of the PPP portraits, $\tilde{X}_k(t - \tau_k)$ versus $\tilde{X}_k(t)$, for the 33 time series. Figure 9, serving as an example, details the results obtained for the initial and final time series, that is, years 1980 and 2012, respectively. Both figures reveal complex patterns that resemble those found in chaotic systems, demonstrating the rich dynamics of forest fires.
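As an illustration of how a PPP portrait is assembled from a sampled signal, the sketch below pairs each sample with the sample taken a fixed number of steps earlier. It assumes the integrated series is a plain list and that the delay has already been converted to a whole number of samples; `ppp_points` is a hypothetical name.

```python
def ppp_points(series, lag):
    """Pairs (X(t), X(t - lag)) that trace the pseudo phase plane portrait.

    The first element of each pair is the horizontal coordinate X(t),
    the second is the vertical coordinate X(t - lag).
    """
    return [(series[i], series[i - lag]) for i in range(lag, len(series))]
```

Plotting the second coordinate against the first yields the two-dimensional portrait; a series of N samples produces N - lag points.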
Figure 11 unveils groups of objects (years) in such a way that objects in the same group (cluster) are more similar to each other than to those in other groups. Figure 11 leads to a result that is easier to interpret, as it identifies groups of objects that are similar, while Figure 10 just maps similarities between pairs of objects.

MDS Analysis and Visualization
In this section we adopt the MDS tool to visualize the relationships between forest fires events. An appropriate metric is proposed and the generated MDS graphs are analysed.
The MDS is a statistical technique for visualizing data. The MDS approach generates maps where objects that are perceived to be similar to each other are placed close together, forming clusters. The maps are indeterminate with respect to translation, rotation, and reflection and the axes have no special meaning. The algorithm requires the definition of a similarity measure (or, inversely, of a distance) and the construction of an $n \times n$ symmetric matrix of similarities (or distances) between each pair of $n$ objects. MDS reproduces the observed similarities by assigning a point to each object in an $m$-dimensional space. For $m = 2$ or $m = 3$ dimensions the points may be displayed on a "map" [27][28][29][30][31][32][33].
We adopt the 33 × 33 similarity matrix $E = [e_{ij}]$, defined by (5). The MDS map for $m = 3$ is depicted in Figure 12. A shorter (larger) distance between two points on the map means that the corresponding objects are more similar (distinct). Figures 13 and 14 depict the Shepard and stress plots, respectively, that assess the quality of the MDS maps. The Shepard diagram shows an acceptable distribution of points around the 45-degree line, which means a good fit of the distances to the dissimilarities. On the other hand, the stress plot reveals that a three-dimensional space describes the locus of the points well. Often, the maximum curvature point of the stress line is adopted as the criterion for deciding the dimensionality of the MDS map.
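The stress value that underlies a stress plot can be computed directly from a candidate map. The sketch below evaluates Kruskal's stress-1 for the metric case, in which the target dissimilarities are used as they are; it assumes a dissimilarity matrix is already available (how the similarity matrix is converted into dissimilarities is not shown here), and the function names are illustrative.

```python
import math


def pairwise_distances(points):
    """Euclidean distance matrix of a list of coordinate tuples."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def stress1(points, dissimilarities):
    """Kruskal stress-1 between map distances and target dissimilarities.

    stress = sqrt( sum_{i<j} (d_ij - delta_ij)^2 / sum_{i<j} d_ij^2 )
    """
    d = pairwise_distances(points)
    n = len(points)
    num = 0.0
    den = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            num += (d[i][j] - dissimilarities[i][j]) ** 2
            den += d[i][j] ** 2
    return math.sqrt(num / den)
```

Evaluating this quantity for maps of increasing dimension gives the points of a stress plot, and plotting the map distances against the dissimilarities pair by pair gives the Shepard diagram.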
The MDS map of Figure 12 exposes the clusters that were previously identified by the hierarchical clustering (Figure 11). Comparing the MDS maps and the visualization trees, we conclude that both allow easy interpretation of the results and that there is no multiannual pattern. The MDS maps have the advantage of being more intuitive, mainly when dealing with a large number of objects.

Conclusion
This paper analysed forest fires data, adopting tools normally used in dynamical systems analysis. The data consisted of a public domain forest fires catalogue, containing information for Portugal and covering 33 years during the period 1980-2012. The events were modelled as time series of Dirac impulses with amplitude proportional to the burnt area. The data was analysed on an annual basis using the PPP and MDS tools. The PPP was used to model forest fires dynamics. Based on an appropriate correlation index, the MDS was adopted to compare annual patterns. Those tools allow different perspectives over forest fires that may be used to better understand such a complex phenomenon.

Figure 1: Yearly evolution of the burnt area, corresponding to forest fires registered in Portugal in the time period 1980-2012 (only events with burnt area equal to or greater than 10 ha are considered).

Figure 2: Yearly evolution of the number of forest fires registered in Portugal in the time period 1980-2012 (only events with burnt area equal to or greater than 10 ha are considered).

Figure 3: Lorenz curve corresponding to forest fires registered in Portugal in the time period 1980-2012 (only events with burnt area equal to or greater than 10 ha are considered).

Figure 4: Burnt area versus time of the occurrences registered in Portugal in the time period 1980-2012, with burnt area equal to or greater than 10 ha.

Figure 5: Amplitude spectra, $|\mathrm{FT}_{2000}|$, of the time series corresponding to year 2000 and PL approximation.

Figure 7: Correlation, $r$, as a function of the time delay, $\tau$, corresponding to year 2000.

Figure 9: The PPP portraits for the initial and final time series, years (a) 1980 and (b) 2012.

Figure 14: Stress plot for the 33 time series and similarity index $e_{ij}$.

Table 1: Optimal time delays for the 33 time series during the period 1980-2012 (columns: year; optimal time delay in days).