Terrain Segmentation of Greece Using the Spatial and Seasonal Variation of Reference Crop Evapotranspiration

1Department of Life Sciences and Biotechnology, University of Ferrara, Via L. Borsari 46, 44121 Ferrara, Italy 2Environmental Conservation & Management, Faculty of Pure and Applied Sciences, Open University of Cyprus, Latsia, P.O. Box 12794, 2252 Nicosia, Cyprus 3Department of Hydraulics, Soil Science and Agricultural Engineering, Aristotle University of Thessaloniki, University Campus, 54124 Thessaloniki, Greece


Introduction
The evapotranspiration rate, which can be achieved under no water restrictions from a well-watered ideal grass surface, is called reference crop evapotranspiration ET  and it is one of the most important hydroclimatic parameters for the implementation of various hydrological and agricultural applications [1].Various methods have been developed for the assessment of ET  [2,3] and the ASCE-standardized Penman-Monteith method, which is an update of the FAO-56 Penman-Monteith, has been proposed by the ASCE-EWRI Task Committee as the most precise method for ET  estimations [2].
The intra-annual and spatial variability of the ET  for regions of Greece has been investigated using different methods by many authors [4][5][6][7][8][9][10][11][12][13][14][15][16][17][18].The latest and the most detailed mean monthly ET  estimations at country scale based on FAO Penman-Monteith method [1] have been performed by Dalezios et al. [5] using 66 meteorological stations covering a period of 15 years.The authors used the kriging technique for the development of 50 km resolution grids.Mardikis et al. [11] tested different interpolation methods for ET  based on 93 meteorological stations and provided improved methods which include the effect of elevation.
The last years, the use of climatic models succeeded to generate high resolution grids of various climatic parameters which enabled the development of global ET  maps.Significant works of long-term mean ET  estimations at global scales have been performed by many authors [19][20][21][22].ET  maps using the FAO-56 Penman-Monteith and ASCE-standardized Penman-Monteith method, which are the most demanding methods in terms of climatic data and Advances in Meteorology elaboration, have been developed at resolutions of 10 arc-min [20] and 0.5 degrees [21] (1 degree = 60 arc-min = 3600 arcsec ≈ 111 km at equator).The highest spatial resolution ET  maps (30 arc-sec) have been developed [22] using the simple method of Hargreaves [23], which requires only temperature data.
Grids of mean monthly ET  can be used in an attempt to capture the spatial and seasonal patterns and to segment the terrain to regions with distinct seasonal ET  variability.Common approaches for such analysis are the typical multivariate methods such as correlation analysis, principal components, factorial kriging analysis, multi-Gaussian co-kriging, regression analysis, and so forth, which have successfully been used for other environmental quality parameters [24][25][26][27][28][29][30].In addition, terrain segmentation techniques have been applied to multitemporal land surface temperature (LST) data in an attempt to define subregions with different seasonal LST variability, to assess its sensitivity to climatic change by revealing thermal anomalies and to support environmental analysis [31][32][33][34][35][36].Similar regionalization and segmentation approaches have also been applied successfully in order to define subregions with different seasonal precipitation patterns [37][38][39].
Objectives of the study are (a) to develop high resolution grids (30 arc-sec) of 50-years mean monthly ET  for Greece using the ASCE-standardized Penman-Monteith, (b) to analyze the spatiotemporal variability of ET  , and (c) to provide a terrain segmentation scheme for Greece based on the spatiotemporal variation of ET  .The techniques of cross-correlation, principal components (PCA), cluster analysis, and general linear models (GLMs) were used.Cross-correlation and PCA are used to capture the dependence of spatiotemporal variation of mean monthly ET  on geographical attributes (latitude and longitude), topography (elevation), and seasonality.-means clustering is used to segment the Greek territory to regions/clusters in which the mean monthly ET  observations approximate to representative mean values with similar temporal variability.GLMs are used to (a) to parameterize the effect of clusters and together with elevation, latitude, and longitude to build models which can provide estimations of mean monthly ET  (b) to verify the robustness of the cluster analysis and (c) to assess the contribution/effect of each cluster in the mean monthly ET  estimations.This effort will be valuable in assisting environmental decision support and agricultural planning in relation to water resources management.

Study Area and Data.
The study area is Greece (South-East Europe) which is confined between the 34 ∘ and 42 ∘ parallel N., with a meridional extent from 19 ∘ to 28 ∘ E. Greece has a typical Mediterranean climate: relatively cold and rainy winters, relatively warm and dry summers, and, generally, extended periods of sunshine.The spatial heterogeneity of climate is mainly attributed to the mountain range of Pindos located in the central part of Greece, while significant differences in the winter's severity are observed between the central-north continental territory and the islands where in the second case the winter is milder (Hellenic National Meteorological Service: http://www.hnms.gr/hnms/english/climatology/climatology html).
The analysis was based on climatic data, which were obtained from the following databases: (i) [40]: this database provides mean monthly values for the parameters of maximum, minimum, and mean temperature at 30 arc-sec spatial resolution (∼1 × 1 km).The data are provided as grids of mean monthly values of the period 1950-2000 (http://www.worldclim.org/).The database also includes a revised version of the GTOPO30 DEM based on SRTM DEM at 30 arc-sec spatial resolution.The data were produced using observation-based datasets after elaboration with the thin-plate smoothing spline algorithm implemented in the ANUSPLIN package for interpolation, using latitude, longitude, and elevation as independent variables.
(ii) [41]: this database provides mean monthly values of parameters such as solar radiation, specific humidity, wind speed at 10 m above surface (which was converted to 2 m above surface), precipitation, and temperature for the period 1948-2006 at 0.
where ET  is the daily reference crop evapotranspiration (mm d −1 ),   is the daily net radiation at the crop surface (MJ m −2 d −1 ),  2 is the mean daily wind speed at 2 m height above the soil surface (m s −1 ),  mean is the mean daily air temperature ( ∘ C),  is the daily soil heat flux density at the soil surface (MJ m −2 d −1 ),   is the mean daily saturation vapour pressure (kPa),   is the mean daily actual vapour pressure (kPa), Δ is the slope of the saturation vapour pressuretemperature curve (kPa ∘ C −1 ),  is the psychometric constant (kPa ∘ C −1 ),   and   are constants, which vary according to the time step and the reference crop type and describe the bulk surface resistance and aerodynamic roughness.The short reference crop corresponds to clipped grass of 12 cm height and surface resistance of 70 s m −1 where the constants   and   have the values 900 and 0.34, respectively.
The tall reference crop corresponds to full cover alfalfa of 50 cm height and surface resistance of 45 s m −1 , where the constants   and   have the values 1600 and 0.38, respectively [2].In this study, the ET  is estimated using (1) for the commonly used short reference crop as follows: (a) mean monthly values of maximum, minimum, and mean temperature were obtained from the database of [40], while (b) wind speed, specific humidity, and solar radiation were obtained from the database of [41].The specific humidity was converted to actual vapour pressure [42] before its use.The equations used for intercalculations in ASCE method are given in [2,43,44].All the calculations were performed in ArcGIS 9.3 ESRI environment.Twelve rasters (one for each month) of mean monthly ET  plus one of mean annual ET  of the period 1950-2000 were developed for the Greek territory.

Methodology for Terrain Segmentation Based on the Annual and Seasonal Variation of a Hydroclimatic Parameter.
A set of four statistical methods such as cross-correlation analysis, principal component analysis, -means cluster analysis, and general linear models [45][46][47][48] are used in this study.Correlation analysis and principal component analysis (PCA) reveal the temporal and spatial pattern evident within a multitemporal dataset.PCA is a linear transformation technique that produces a set of images known as principal components PCs that are uncorrelated with one another while they are ordered in terms of the amount of variance they explain from the original image set [49,50].PCs are computed from the linear combination of eigenvectors and the corresponding pixel values of the initial images [51].
PCA has traditionally been used in remote sensing as a means of data compaction since it is usually observed that the first 2 or 3 components are able to explain the majority of the variability in data values.Later components thus tend to be dominated by noise effects.By rejecting these later components, the volume of data is reduced, with no appreciable loss of information [52].Standardized principal components analysis [53] is applied (data per month is centered with mean 0.0 and standard deviation 1.0) and so each image is not weighed according to its variance.-means cluster analysis was used to partition the multitemporal (12-dimensional) imagery of ET  into  exclusive clusters.It begins by initializing cluster centroids, then assigns each pixel to the cluster whose centroid is nearest, updates the cluster centroids, and then repeats the process until the stopping criteria are satisfied [51].The analysis uses Euclidian distance for calculating the distances between pixels and cluster centroids.The underlying idea of cluster analysis is that the cluster centroids represent the mean expression of the derived clusters.So clustering of the multitemporal data sets is expected to define groups of pixels with a rather common centroid curve that expresses their average monthly variability [32].Elevation, latitude, and longitude statistics per cluster are computed in order to assist interpretation.Finally the clusters were interpreted according to their centroid and their spatial arrangement [36].
General multiple linear regression analysis was performed using Statgraphics Centurion software (StatPoint Technologies).The parameterization of segmentation effects was made using an integer cluster number CN ranging from 1 to .The CN categorical factor together with the quantitative factors of latitude (Lat), longitude (Lon), and elevation () were used as independent variables to describe the mean monthly values of ET  .The values of the aforementioned parameters were extracted from 29765 randomly selected positions homogeneously distributed in the entire Greek territory.The selection was performed using the Sampling Design Tool developed by NOAA (National Oceanic and Atmospheric Administration of USA) based on the procedure of "stratified random sampling."For each month, the mean values of the dependent variable (mean monthly ET  ) of sampling positions belonging to each cluster were compared using ANOVA-LSD.This analysis was performed in order to verify that the sampling procedure retained the differences between clusters, which were derived by -means using the whole number of pixels of each cluster.Square root transformation was used for the Lat, Lon, and  values while BoxCox transformation [54] was used for the mean monthly ET  values to avoid/reduce normality deviations of the dependent and independent variables.The selection of the parameters for the regression model was performed using their variance inflation factor (VIF) in order to avoid multi-co-linearity effects.VIF values above 10 are usually considered to indicate serious multicollinearity [45].Thus, the following linear model was built to describe the mean monthly ET  (mm) for each month: where  0 is intercept of the regression model and   is the regression coefficients of the categorical factor  CN which is regulated by the cluster number CN,  +1,+2,+3 : the regression coefficients of the quantitative factors Lon, Lat (decimal degrees), and  (m), respectively, and  is the month (starting with  = 1 for January and ending with  = 12 for December).The value of  is equal to the number of clusters minus 1 ( =  − 1) whereas the categorical factor  CN is regulated by the cluster number CN as follows: The general form of BoxCox transformation for the dependent variable ET  for each month  is given according to the following formula [54]: Elevation (m a.s.l.) High: 2710 Low: 1 1 : 10,000,000 0 75 150 300 (km) High: 1618 Low: 506 1 : 10,000,000 0 75 150 300 (km) where  1,2 is the power and shift parameters of BoxCox transformation, respectively, and  is the number of ET  observations (samples) for a month .Note that  2 is the geometric mean of ET  + 2 .In order to improve the efficiency of the model, the power parameter  1 was optimized while the shift parameter  2 was set to 0 based on the default optimization procedure incorporated in the statistical software.Τhe optimal transformation is the one that minimizes the mean squared error of the transformed dependent variable [54].
Outliers were not removed during the regression analysis because of the large number of data (there is no automatic procedure to remove outliers).Autocorrelation of the residuals was tested using the Durbin-Watson test [55,56], which provides values between 0 and 4 where the optimum value for no autocorrelation is equal to 2. ANOVA was used to estimate the statistical significance of the regression model while square correlation coefficient  2 adjusted for degrees of freedom was used to evaluate the explanatory power of the model.Type III sums of squares analysis was used to estimate the statistical significance of the independent variables CN, Lat, Lon, and .The ET  map (Figure 1(b)) indicates that the lowland areas (below 250 m) with a distance less than 20-25 km from the shoreline and with latitude less than 39 degrees present annual values of ET  > 1200 mm.Regions with extremely high mean annual ET  values >1400 mm occupy approximately 13.5% of the Greek territory and they are distributed mainly to lowlands along the coastlines of (a) Crete and Dodecanese islands (southern and eastern Aegean Sea), (b) eastern Attika, and (c) eastern Peloponnesus.These findings are in accordance to the study of [5] who used a 50 km resolution grid and the studies of [57,58] who analyzed data from meteorological stations.

Spatial and Seasonal Variation of Mean
High: 2.97096 Low: −3.59423 0 75 150 300 Figure 3: The first two principal components PC-1 and PC-2 that account for the 97.9% of the ET  variance.maximized during the winter.The absolute values of ET  -Lon correlations are generally low and they are maximized during the period of August-January.
PCA was implemented in order to interpret the monthly intercorrelations observed in Table 1 and the corresponding eigenvalues and eigenvectors are presented in Table 2.According to the eigenvalues, the first two principal components (PC-1 and PC-2) account for the 97.9% of the total variance observed in the 12 monthly ET  images.
The PC-1 map (Figure 3(a)) is composed of linear combinations of monthly ET  images which are not influenced by seasonality since the weights of eigenvectors slightly vary between 0.28 and 0.30 (Table 2).Taking into account the correlation coefficients of monthly ET  versus  and Lat from Table 1, it is easy to detect that PC-1 amplifies the effects of elevation and latitude.The effects of elevation and latitude indirectly express the effects of temperature, which is higher in the lowlands but also higher in lower latitudes.The PC-1 is spatially maximised in the lowlands of the southern continental Greece and in the islands.
For PC-2 map (Figure 3(b)), the linear combinations of monthly ET  images amplify the difference between the periods of October-March and April-September which present negative and positive weights of eigenvectors, respectively (Table 2).The difference between these two periods is spatially maximised in central Greece, in central Macedonia (Thessaloniki plain and Chalkidiki regions), in eastern Thrace, in the islands of north-eastern Aegean sea and south Ionian sea.These patterns are in accordance with the spatiotemporal patterns given by [5,8,59].

Terrain Segmentation Based on the Spatial and Seasonal
Variation of   .For the implementation of the -means method, the following criteria were used: (a) small clusters with area extent (occurrence) less than 0.001% were eliminated by merging them with larger clusters that are closest to their centroids and the stopping criterion was defined as the percentage of the migrating pixels during a specific iteration (if it was less than 0.001% of the entire image pixels the clustering was terminated), (b) the maximum number of clusters  was defined by a trial and error procedure, and (c) the maximum number of iterations was set equal to 300.At the end of the analysis, the mean monthly values of ET  are given for each cluster (described as multipart polygons).Cluster analysis revealed 11 clusters ( = 11) in which the mean monthly ET  observations approximate to representative mean values with similar temporal variability.The characteristics of their centroids (latitude, longitude), their percent area extent, mean elevation, and the variation of monthly ET  are given in Table 3.The spatial distribution of the 11 clusters is given in Figure 4.In general, ET  is maximized during summer and minimized during winter where the 76.7% of annual ET  in Greece is observed during April to September (warm-dry period), while the 23.3% is observed during October to March (coldwet period) (Table 3).Clusters of significant interest are the following: (i) Clusters 9, 10, and 11 present mean annual values of ET  > 1400 mm, cluster 9 presents the highest mean monthly ET  among all clusters during the period of May-September, and cluster 11 presents the highest mean monthly ET  among all clusters during the period of October-April.(ii) Clusters 3 and 4 present the lower mean annual values of ET  and correspond to the higher elevation areas of Greece.(iii) Clusters 1 and 5 cover the major part of the agricultural land in Greece.

Regression Models of Mean
Monthly   .The eleven clusters ( = 11) obtained by the -means analysis were used to calibrate (2) for each month using the observations of monthly ET  , Lon, Lat, , and CN values of the 29765 selected sampling positions.The sampling positions were distributed to each cluster based on the % cluster's area extent (Table 3).Their number within each cluster is given in Table 4.The comparison of the mean monthly values of the dependent variable (the BoxCox transformed mean monthly ET  ) of sampling positions belonging to each cluster based on ANOVA-LSD is also given in Table 4.This analysis verified that the sampling procedure retained the differences between clusters which were derived by -means taking into account the fact that only 4 out of 660 combinations of pairs comparisons were found nonstatistically different (e.g., pairs of clusters 3 and 4 for January, clusters 9 and 10 in June, clusters 7 and 9 in August, and clusters 9 and 10 in October) (Table 4).
The variance inflation factors VIFs of the selected independent variables of (2) ranged between 1.26 and 3.3 (the maximum VIF value corresponds to the transformed value of elevation).The regression coefficients and the statistical tests of general multiple regression analysis are given in Table 5 (twelve models, one for each month) whereas the respective graphs of observed versus predicted values of the BoxCox  transformed mean monthly ET  are given in Figure 5.For all models, type III sums of squares analysis showed that the independent variables CN, Lat, Lon, and  were statistically significant at 95% confidence level with  < 0.001, while the Durbin-Watson test indicated no serial autocorrelation of the residuals at 95% confidence level.Square correlation coefficients  2 ranged between 0.88 and 0.96 indicating satisfactory accuracy of the twelve models.
The analysis of the general linear models verified the robustness of the cluster analysis in two ways (a) by the statistically significant difference between the clusters, which was derived by the ANOVA-LSD (Table 4), and (b) by the high statistical significance of the CN parameter as independent variable in the models and their high square correlation coefficients.The contribution/effect of each cluster in the spatiotemporal variation of ET  in the Greek territory, as it  is described by the sum term inside (2), is given in Figure 6.
According to Figure 6, clusters 1, 2, 3, and 4 present a negative effect while clusters 6, 7, 9, 10, and 11 present a positive effect to all months.The effect of clusters 5 and 8 presents small positive values during the warm season and small negative values during the cold season.The contribution of each cluster (Figure 6) defines the relative behaviour of each cluster during each month using, as a reference, the mean monthly conditions of ET  in the Greek territory.

Conclusions
The derived high resolution dataset of mean monthly ET  and the statistical analysis captured the seasonal and spatial variation of ET  in Greece while the regional spatiotemporal trends were revealed and assigned to spatial clusters.More specifically, cross-correlation and principal component analysis quantified the temporal pattern of ET  and revealed season, latitude, longitude, and elevation dependencies, while -means cluster analysis segmented the terrain to regions with distinct temporal variability of ET  .The derived spatiotemporal patterns are in accordance with previous research efforts.
The parameterization of each cluster (revealing a unique temporal signature of monthly ET  ) together with topography (elevation) and spatial arrangement (latitude and longitude) described satisfactorily the mean monthly ET  through generalized linear models verifying the robustness of the cluster analysis, providing also a tool for ET  estimations for agricultural purposes.The feasible development of high resolution grids of monthly ET  based on available meteorological data produced by climatic models and the terrain segmentation approach can be used as tools to design decision support and management schemes of water recourses.
The procedures described in this study can be used as a general framework for the spatial and seasonal variation not only for ET  but also for other climatic parameters while it can support future research efforts such as (a) assessing any climate changes by integrating monthly ET  data sets or other climatic variables of the forecoming years; the new data sets can be used to recompute the temporal pattern of the climatic variable within each cluster in an attempt to evaluate any trends in the centroid change between two periods; (b) using the proposed framework at watershed scale in order to facilitate the surface integration of climatic variables obtained by meteorological stations based on clusters produced by long-term data such as the ones of this study.

Figure 1 :
Figure 1: Maps of (a) elevation  and (b) mean annual ET  for the period 1950-2000 estimated by the ASCE standardized Penman-Monteith method.

3. 1 .
Spatial Variation of Mean Annual   .The 30 arc-sec resolution (∼1 × 1 km) maps of elevation  and mean reference crop evapotranspiration ET  for the period 1950-2000 estimated by the ASCE standardized Penman-Monteith method are given in Figures 1(a) and 1(b), respectively, while the frequency distributions of pixel values are given, respectively, in Figures 2(a) and 2(b).

Figure 2 :
Figure 2: Frequency analysis of the spatial variation of (a) elevation  and (b) mean annual ET  estimated by the ASCE standardized Penman-Monteith method for the period 1950-2000.

Figure 4 :
Figure 4: The spatial distribution of the 11 clusters of ET  (on the bottom left corner of cluster figures are also given the CN values).

Figure 6 :
Figure6: The contribution/effect of each cluster in the spatiotemporal variation of ET  in the Greek territory as it is described by the sum term inside (2).
Monthly   .The correlation matrix of the mean monthly values of reference evapotranspiration ET  versus , Lat, and Lon is quantified in Table 1.According to Table 1, the mean monthly ET  presents negative correlations with  and Lat while a rather week positive correlation between ET  and Lon exists.The absolute values of ET  - correlations are maximized during the warm season while the ET  -Lat correlations are

Table 1 :
Cross-correlation matrix of the mean monthly ET  (January to December) as well as the latitude, longitude, and elevation images.Correlations are performed between pairs of variables taking values from pixels of the same geographical position.

Table 2 :
Eigenvalues and eigenvectors for the PCs of the mean monthly ET  .

Table 4 :
Number of sampling positions within each cluster and ANOVA-LSD comparisons of the mean values of the dependent variable between the 11 clusters for each month.
†Different letters indicate statistically significant differences between clusters of each month at 95% confidence level (ANOVA-LSD).

Table 5 :
(2)ression coefficients and statistical tests of the general linear model described by(2)for each month.