Soil moisture retrieval is one of the most challenging problems in the context of biophysical parameter estimation from remotely sensed data. Typically, microwave signals are used thanks to their sensitivity to variations in the water content of soil. However, especially in the Alps, the presence of vegetation and the heterogeneity of topography may significantly affect the microwave signal, thus increasing the complexity of the retrieval. In this paper, the effectiveness of RADARSAT2 SAR images for the estimation of soil moisture in an alpine catchment is investigated. We first carry out a sensitivity analysis of the SAR signal to the moisture content of soil and other target properties (e.g., topography and vegetation). Then we propose a technique for estimating soil moisture based on the Support Vector Regression algorithm and the integration of ancillary data. Preliminary results are discussed both in terms of accuracy over point measurements and effectiveness in handling spatially distributed data.
Soil moisture content is a key parameter in many hydrological processes. It controls the infiltration rate during precipitation events, runoff production, and evapotranspiration [
In the last few years, the growing number of space-borne sensors providing complete and frequent coverage of the Earth’s surface has driven increasing interest in the estimation of bio-geophysical surface parameters from remotely sensed data. In this field, one of the most challenging problems is the estimation of soil moisture content from microwave sensors, in particular Synthetic Aperture Radars (SARs).
The sensitivity of microwave signals to the soil moisture content depends on the influence of water on the dielectric constant and has been well established in several studies [
Topography is another important aspect (in addition to the effects of vegetation and surface roughness) to be taken into consideration when dealing with the estimation of soil parameters. Satellite systems, in particular SAR systems, are strongly affected by the topography of the area. Distortion effects (i.e., foreshortening, layover, and shadowing) may occur due to the side-looking acquisition geometry (specific of the SAR sensor) and the presence of topography on the ground. Even if these extreme distortion effects do not occur, the SAR signal is affected by the local incidence angle and the distance between the target area and the sensor antenna. These topographic effects are usually taken into consideration during the calibration of the data. However, when dealing with mountain areas, such as the Alps, it is fair to expect to have a nonnegligible residual contribution within the signal due to the extreme topographic conditions [
From the methodological viewpoint, the retrieval of soil moisture content can be considered as a mapping problem from the space of the measured signal (i.e., the backscattering signal) to the space of the desired biophysical parameter (i.e., the soil moisture content). This task is commonly addressed by means of the inference of the desired mapping from theoretical forward models, such as the Integral Equation Model (IEM), with the use of iterative methods or nonlinear machine learning techniques [
All these aspects make the characterization of soil moisture in alpine areas from remotely sensed data extremely complex and challenging. With a view to integrating soil moisture estimates into real application scenarios, like those cited above, it is important to have a clear understanding of the possibilities, but also the limitations, of the new generation of satellite SAR sensors in combination with advanced state-of-the-art methodologies for the retrieval of soil parameters in the Alpine environment. Although some work in this direction has started, further analysis is required. The SOFIA project (SOil and Forest Information retrieval with RADARSAT2 images) fits into this context and aims at investigating the capability of the new generation polarimetric RADARSAT2 satellite SAR sensor in combination with advanced state-of-the-art methods for the estimation of soil and forest biophysical parameters in the Alpine environment. This paper introduces the rationale behind the experimental analysis carried out in the context of the SOFIA project for the specific topic of soil moisture estimation. The main objectives of the proposed work are (1) to present the test area and the setup for the ground measurements, (2) to analyze the sensitivity of the RADARSAT2 polarimetric data to the soil moisture content in an Alpine catchment and the necessity to integrate SAR images with ancillary data, and (3) to present the first results of soil moisture estimation derived from the inversion procedure based on the Support Vector Regression technique.
The rest of the paper is organized as follows. Section
The study area of the SOFIA project is the Alto Adige Province, located in Northern Italy (see Figure
Study area of the SOFIA project: (a) Alto Adige Province and (b) Mazia Valley, with the location of the fixed measurement stations. The stations called “Transect” are the most complete ones, each including 4 soil water content sensors at two depths (5 and 20 cm). The stations called “Catchment” include one soil water content sensor at two depths (5 and 20 cm).
Thus, Alto Adige represents an interesting test site for the following reasons: high vulnerability to climate change in fields closely connected to the project’s objectives (drought, lack of water, natural hazards, yield); representativeness at least for the central and southern Alps; high diversity of land use, covering almost all land-use types of central European mountain areas; good data supply; and good contact with partners and access to the results of several scientific projects.
Within the Alto Adige area, the Mazia valley (Figure
The valley is equipped with 16 fixed stations for the measurement and monitoring in time of soil parameters (moisture content at 5 and 20 cm depth) and meteorological data (air temperature and humidity, precipitation, wind speed and direction, solar radiation) [
During the summer of 2010, two images were acquired by RADARSAT2 over the Mazia valley, on 3rd June and 21st July. The sensor acquisition mode was Standard Quad Polarization, with a mean incidence angle of 45° and an ascending orbit. The acquisition geometry was selected so that the area of interest, characterized by a highly variable topography, was imaged while minimizing the layover and shadowing effects on the east side of the valley, where a higher number of field measurement stations are present. Original images were provided in single look complex (SLC) format with pixel size of 4.93 m and 17.48 m in azimuth and ground range directions, respectively. The data were then multilooked, calibrated, and geocoded with the help of a high-resolution (2.5 m) digital elevation model, and filtered with a Frost filter (window size 5 × 5) in order to reduce the effect of speckle noise. The final resolution of the processed images is 20 m. All the preprocessing has been carried out with the SARscape software (
RADARSAT2 image acquired on July 21st, false color RGB composition (R = HH, G = HV, B = VV).
Concurrently with the satellite acquisitions, two field measurement campaigns were carried out in the Mazia valley. The aim was to acquire information on the soil parameters (moisture content and roughness) and on the vegetation status (biomass and vegetation water content) of meadow and pasture areas. These measurements have been exploited during the project for different purposes: (1) the calibration of the fixed measurement stations located in the valley, in order to have consistent information at these locations also for future satellite overpasses and acquisitions, (2) the analysis of the sensitivity of RADARSAT2 measurements to the properties of soils and vegetation in alpine areas, and (3) the development and validation of the algorithm for the estimation of the soil parameters from the satellite images.
Two different kinds of measurements have been performed: (1) destructive measurements of both vegetation and soil samples, by physically taking a sample of grass and soil. This kind of sampling was necessary to have accurate measurements of biomass, vegetation water content, soil gravimetric moisture, and bulk density. All the samples have been acquired, weighed, and then sealed in order to be dried in the laboratory according to standard measurement protocols [
Ranges of variability of the dielectric constant (real part) values measured during the field campaigns.
Statistic | Meadow, June 2010 | Meadow, July 2010 | Pasture, June 2010 | Pasture, July 2010 |
---|---|---|---|---|
Min dielectric constant value | 6.7 | 3.8 | 6.4 | 3.2 |
Max dielectric constant value | 23.2 | 27 | 17.7 | 8.7 |
Average dielectric constant value | 16.7 | 15.4 | 11.6 | 5.7 |
In this paper, we address the real part of dielectric constant because it represents the dielectric properties to which the SAR e.m. waves are particularly sensitive. The imaginary part of dielectric constant is in general very low and in most cases can be considered negligible [
To carry out the analysis presented in this work, ancillary data already available or extracted from satellite optical sensors have been considered. In greater detail, these are: (1) a digital elevation model (DEM) with high spatial resolution (2.5 m), obtained from the processing of airborne lidar acquisitions over the whole Alto Adige area during a measurement campaign in 2008; (2) two normalized difference vegetation index (NDVI) maps extracted from two images acquired by the NASA MODIS sensor onboard the Terra satellite as close as possible to the RADARSAT2 satellite overpasses (i.e., within ±1 day from the RADARSAT2 acquisition); and (3) a high-resolution (25 m) land-cover map of the Mazia valley derived from ortho-photos, ground surveys, and visual interpretation. MODIS is a multispectral sensor with 36 spectral channels which acquires information in the visible and infrared portions of the spectrum with daily coverage of the whole Earth’s surface. The high temporal resolution of this system maximizes the probability of having cloud-free acquisitions as close as possible to the date of interest. The spatial resolution of the sensor is 250 m in the red and near-infrared bands, the portions of the spectrum used for the computation of the NDVI values.
Ancillary data have been geocoded and resampled (bilinear interpolation) so that they overlay the RADARSAT2 images exactly.
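As a minimal sketch of how an NDVI value is obtained from the red and near-infrared bands mentioned above, the standard definition NDVI = (NIR − RED) / (NIR + RED) can be written as follows. The function name, the handling of zero-reflectance pixels, and the reflectance values are illustrative assumptions, not part of the processing chain used for the MODIS products.

```python
import numpy as np

def ndvi(red, nir, eps=1e-12):
    """NDVI = (NIR - RED) / (NIR + RED), computed element-wise."""
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    denom = nir + red
    # Pixels with zero total reflectance are set to NaN instead of dividing by zero.
    out = np.full(denom.shape, np.nan)
    valid = np.abs(denom) > eps
    out[valid] = (nir[valid] - red[valid]) / denom[valid]
    return out

# Hypothetical reflectances: dense grass, sparse grass, and a no-data pixel.
red = np.array([0.05, 0.20, 0.0])
nir = np.array([0.45, 0.30, 0.0])
values = ndvi(red, nir)
```

Higher values indicate denser green vegetation, which is why NDVI is used here as a proxy for vegetation coverage.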
In order to understand the sensitivity of the RADARSAT2 signal to the moisture content of the investigated area, scatter plots of the backscattering coefficients at different polarization configurations versus the dielectric constant values were generated. To this purpose, in the two satellite images a small 3 × 3 pixels region was considered in correspondence of each field measurement point. Then the backscattering values were averaged and the resulting mean value was associated to the corresponding field measurement. Samples associated to foreshortening and layover areas were discarded from the analysis. Finally, considering both the acquisition dates and both meadow and pasture land cover types, 75 samples were used in the analysis. Figure
Scatter plots of backscattering coefficients extracted from the RADARSAT2 images versus dielectric constant measurements in the case of (a) HH polarization configuration and (b) HV polarization configuration.
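The extraction of one backscattering value per field point, described above as the mean over a 3 × 3 pixel region, can be sketched as follows. This is an illustration only: the function name, the clipping of the window at image borders, and the choice to average in linear power units before converting to dB are assumptions that the text does not specify.

```python
import numpy as np

def window_mean(image, row, col, size=3):
    """Mean of a size x size window centred on (row, col), ignoring NaNs."""
    half = size // 2
    # Clip the window to the image extent so border pixels are still usable.
    r0, r1 = max(row - half, 0), min(row + half + 1, image.shape[0])
    c0, c1 = max(col - half, 0), min(col + half + 1, image.shape[1])
    return float(np.nanmean(image[r0:r1, c0:c1]))

# Hypothetical 5 x 5 backscatter tile in linear power units.
sigma0 = np.arange(25, dtype=float).reshape(5, 5)
mean_linear = window_mean(sigma0, 2, 2)   # full 3 x 3 window
mean_corner = window_mean(sigma0, 0, 0)   # window clipped to 2 x 2
mean_db = 10.0 * np.log10(mean_linear)    # conversion to decibels
```

Averaging over a small neighbourhood reduces the residual speckle and mitigates small geolocation errors between the image and the field point.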
From a first analysis, it is possible to observe that the points associated with meadows show the expected increasing trend with the dielectric constant (more evident for the HH than for the HV polarization). On the contrary, no clear trend can be recognized in the samples associated with the pastures. In greater detail, these samples show a high level of ambiguity (i.e., samples with similar dielectric constant values present significant differences in terms of backscattering coefficients), especially for low dielectric constant values. As explained previously, different target properties and external factors may affect the microwave signal acquired by the satellite sensor. Taking into account the environmental conditions observed during the field measurement campaigns, two factors can be considered mainly responsible for the variability and ambiguity observed in the pasture samples: (1) the topography and (2) the heterogeneity of the vegetation/land-cover. In the following, these two aspects are investigated in more detail with the help of ancillary data, in order to understand whether and to what extent they affect the RADARSAT2 measurements.
As explained previously, topography significantly affects the signal acquired by a satellite SAR system. In our case, although the calibration of the signal was carried out with the help of a detailed digital elevation model, residual topographic effects are expected to introduce significant ambiguity in the backscattering coefficients. This is expected especially for pastures, since they extend over large portions of the valley sides, with altitudes ranging from 1200 to 2400 meters. On the contrary, meadows are mainly located in the valley floor, thus they present similar topographic conditions.
In order to investigate the effect of topography on the backscattering signal, the digital elevation model has been exploited for the extraction of two topographic features: the local incidence angle of the SAR signal (i.e., the angle between the line of sight of the SAR sensor and the direction normal to the surface within the resolution cell, which takes into account the local topography of the area) and the local altitude. The samples associated to the pasture (which demonstrated the highest ambiguity in the SAR signal, as shown in Figure
Scatter plots of backscattering coefficients extracted from the RADARSAT2 images versus dielectric constant measurements over pasture areas and with dielectric constant values between 4.5 and 5.5 in the case of (a) HH polarization configuration and (b) HV polarization configuration. The samples are grouped into 4 clusters according to the topographic features extracted from the DEM.
In the plots, it is possible to observe that samples with similar characteristics in terms of altitude and local incidence angle are close to one another and located in specific portions of the feature space. In greater detail, samples acquired in areas with low altitude and high local incidence angles of the SAR signal present the lowest values of the backscattering coefficient. On the contrary, samples associated with areas with high altitude and low local incidence angles are characterized by the highest backscattering coefficients. The difference between these two extreme topographic conditions is pronounced and amounts to 8-9 dB for both HH and HV polarization configurations. The samples with intermediate topographic characteristics, that is, low altitude with low incidence angle and high altitude with high incidence angle, are located between these two extremes. It emerges that both the local incidence angle of the SAR signal and the local altitude of the investigated area affect the backscattering coefficient, either attenuating or increasing its value. However, a certain level of variability still remains in the data, as can be observed, for example, in the cluster of samples associated with high altitude and high local incidence angle. This suggests that topography is not the only factor that affects the SAR signal in these environmental conditions.
As it was observed in the Mazia valley during field campaigns, the Alpine landscape is characterized by a high variability and heterogeneity in terms of vegetation/land-cover. Meadows, located in the valley floor, are intensively farmed and irrigated. The soil is typically homogeneous, flat in terms of roughness, and the grass is typically thick. Cut events during the summer period determine variations in the biomass of the vegetation coverage. Pastures have completely different characteristics. First of all, they are located on the sides of the valley where the terrain becomes steep and the altitude increases. The soil is heterogeneous, with the presence of stones and, at higher altitudes, in some cases large rocky areas. The vegetation coverage is also irregular, with some areas showing a significant presence of grass and others being less vegetated or nearly bare.
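The local incidence angle used in this analysis is the angle between the sensor line of sight and the local surface normal. A minimal sketch of how it could be derived from a DEM is given below. It assumes a flat-Earth geometry, a constant look direction over the tile, x running along columns and y along rows, and a hypothetical `sensor_azimuth_deg` parameter giving the horizontal direction from the target towards the sensor; none of these conventions are taken from the processing actually applied to the data.

```python
import numpy as np

def local_incidence_angle(dem, pixel_size, incidence_deg, sensor_azimuth_deg):
    """Angle (degrees) between the direction to the sensor and the surface normal.

    dem: 2-D array of heights (m); x runs along columns, y along rows.
    incidence_deg: incidence angle on flat terrain, measured from the vertical.
    sensor_azimuth_deg: horizontal direction from the target towards the
        sensor, measured from +x towards +y (assumed convention).
    """
    dz_dy, dz_dx = np.gradient(np.asarray(dem, dtype=float), pixel_size)
    # Upward normal of the surface z = f(x, y), normalised to unit length.
    normal = np.stack([-dz_dx, -dz_dy, np.ones_like(dz_dx)], axis=-1)
    normal /= np.linalg.norm(normal, axis=-1, keepdims=True)
    theta = np.radians(incidence_deg)
    phi = np.radians(sensor_azimuth_deg)
    to_sensor = np.array([np.sin(theta) * np.cos(phi),
                          np.sin(theta) * np.sin(phi),
                          np.cos(theta)])
    cos_lia = np.clip(normal @ to_sensor, -1.0, 1.0)
    return np.degrees(np.arccos(cos_lia))

flat = np.zeros((4, 4))
# Hypothetical 45-degree slope rising towards +x, so it faces -x.
slope = np.tile(np.arange(4) * 10.0, (4, 1))
lia_flat = local_incidence_angle(flat, 10.0, 45.0, 0.0)
lia_away = local_incidence_angle(slope, 10.0, 45.0, 0.0)       # sensor on +x side
lia_towards = local_incidence_angle(slope, 10.0, 45.0, 180.0)  # sensor on -x side
```

On flat terrain the result equals the nominal incidence angle, a slope tilted towards the sensor gives a smaller angle, and a slope tilted away gives a larger one. In a real raster the row axis often points south, so the sign of the y gradient would need to be checked against the georeferencing.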
Vegetation influences the microwave signal by introducing an attenuation effect with respect to bare soils, as indicated in several studies [
Plots shown in Figure
Scatter plots of backscattering coefficients extracted from the RADARSAT2 images versus dielectric constant measurements over pasture areas and with dielectric constant values between 4.5 and 5.5 in the case of (a) HH polarization configuration and (b) HV polarization configuration. NDVI values are shown for the samples which show strong residual variability in the backscattering coefficient value.
The sensitivity analysis presented in this section suggests that the backscattering coefficients measured by the RADARSAT2 SAR sensor are sensitive to variations in the dielectric constant of soils, thus to variations in the moisture content. However, the microwave signal is also strongly affected by the topography of the area (even after standard topographic correction) and the heterogeneity of the vegetation/land-cover. These factors should be properly taken into consideration for the retrieval of the moisture content of soils under these challenging environmental conditions.
Due to the effect of topography and vegetation/land-cover heterogeneity on the SAR signal, the retrieval of soil moisture content in alpine areas becomes particularly challenging and complex. Estimation approaches based on the inversion of theoretical models may not be effective. Given the high complexity and heterogeneity of the physical phenomena that affect the microwave signal, it is fair to expect that theoretical models (whose formulation introduces several approximations and simplifications) will not be reliable and accurate in the estimation. In order to deal with this issue, a possible solution is the direct exploitation of the information contained in the data acquired during the field campaigns by means of nonlinear machine learning techniques. In particular, in this work we propose to address the estimation problem with the
Thanks to its formulation, SVR is able to handle complex nonlinear estimation problems with good intrinsic generalization capability also in presence of a limited number of training samples [
Let us consider a generic estimation problem. We would like to retrieve a continuous variable
Given a set of
The optimal linear function in the transformed feature space is selected minimizing a cost function, which is the combination of the training error (empirical risk) and the model complexity (structural risk). The first term is calculated according to a
Example of a possible choice of the
The constrained optimization problem in (
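For reference, the textbook formulation of ε-insensitive SVR is summarized below. The symbols (weight vector w, bias b, mapping Φ, slack variables ξ and ξ*, regularization parameter C, tube width ε, kernel width γ) follow common usage in the SVR literature and are assumptions here; they may differ from the notation adopted in this paper.

```latex
% Primal problem: trade-off between model complexity and training error
\min_{\mathbf{w},\,b,\,\xi,\,\xi^{*}} \;
  \frac{1}{2}\lVert \mathbf{w} \rVert^{2}
  + C \sum_{i=1}^{N} \left( \xi_i + \xi_i^{*} \right)
\quad \text{subject to} \quad
\begin{cases}
  y_i - \left( \mathbf{w} \cdot \Phi(\mathbf{x}_i) + b \right) \le \varepsilon + \xi_i \\
  \left( \mathbf{w} \cdot \Phi(\mathbf{x}_i) + b \right) - y_i \le \varepsilon + \xi_i^{*} \\
  \xi_i,\; \xi_i^{*} \ge 0, \qquad i = 1, \dots, N
\end{cases}

% Resulting estimate, expressed through the kernel function
f(\mathbf{x}) = \sum_{i=1}^{N} \left( \alpha_i - \alpha_i^{*} \right)
  K(\mathbf{x}_i, \mathbf{x}) + b,
\qquad
K(\mathbf{x}, \mathbf{x}') = \exp\!\left( -\gamma \lVert \mathbf{x} - \mathbf{x}' \rVert^{2} \right)
```

Only training samples with a nonzero difference between the two Lagrange multipliers (the support vectors) contribute to the estimate, which is one reason the method behaves well with few training samples.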
The retrieval process is divided into two phases: (1) the training of the SVR algorithm and (2) the estimation phase.
During the training, the available training samples (i.e., the measurements acquired during the field campaign associated to the corresponding values of the microwave signal extracted from the RADARSAT2 images) are provided to the technique in order to learn the underlying relationship between the input features and the output target value. Typically, the samples are divided into two subsets: the first is used as training and the second is used as validation to assess the estimation performance of the technique (in terms of accuracy or other quality metrics) with different configurations of the free model parameters. In our analysis, in order to avoid problems related to the choice of the training and validation sets, we applied a
After the regressor is trained, it is applied to the multi-dimensional image (which shall contain the same features considered during the training of the technique) in order to obtain the estimated moisture content map.
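The estimation phase amounts to reshaping the multi-dimensional image into one feature vector per pixel, applying the trained regressor, and reshaping the outputs back into a map. The sketch below illustrates this with a generic `predict` callable standing in for the trained SVR; the function name, the (bands, rows, cols) layout, and the optional validity mask are illustrative assumptions.

```python
import numpy as np

def predict_map(feature_stack, predict, valid_mask=None):
    """Apply a per-sample regressor to every pixel of a (bands, rows, cols) stack.

    predict: callable mapping an (n_samples, n_features) array to n_samples values.
    valid_mask: optional (rows, cols) boolean array; masked-out pixels become NaN.
    """
    bands, rows, cols = feature_stack.shape
    # One row per pixel, one column per feature, in the training feature order.
    samples = feature_stack.reshape(bands, -1).T
    if valid_mask is None:
        valid_mask = np.ones((rows, cols), dtype=bool)
    flat_mask = valid_mask.ravel()
    out = np.full(rows * cols, np.nan)
    if flat_mask.any():
        out[flat_mask] = predict(samples[flat_mask])
    return out.reshape(rows, cols)

# Hypothetical 2-band stack and a stand-in "regressor" that sums the features.
stack = np.stack([np.ones((2, 3)), np.arange(6, dtype=float).reshape(2, 3)])
mask = np.array([[True, True, False], [True, True, True]])
estimated = predict_map(stack, lambda X: X.sum(axis=1), mask)
```

A mask of this kind could be used, for instance, to exclude layover and shadow areas or land-cover classes not represented in the training set.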
In our experiments, we considered a 5-fold cross validation procedure, with the mean squared error (MSE) and the slope of the linear trend of estimated versus true target values as quality metrics to drive the multiobjective model selection. The optimal solution is selected on the basis of a visual inspection of the estimated Pareto front (i.e., the set of optimal solutions of the multiobjective model selection problem). Concerning the SVR technique, we selected an RBF Gaussian kernel and the following ranges for the model parameters: [10⁻³; 10³] for
As input features of the estimation system, we considered the four polarimetric configurations of the RADARSAT2 image; the altitude and the local incidence angle extracted from the DEM as topographic features; and the NDVI and land-cover maps as features for the characterization of the vegetation/land-cover heterogeneity. Different experiments were carried out with different combinations of these features, selected according to a sequential forward selection (SFS) strategy, in order to identify the subset that provides the best results in terms of estimation accuracy.
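To illustrate the idea of a Pareto front in this model selection setting, the sketch below identifies the non-dominated configurations among a few candidates. It assumes that both objectives are expressed as costs to be minimized, with the slope objective written as |1 − slope|; this formulation, the function name, and all score values are hypothetical and only serve the example.

```python
import numpy as np

def pareto_front(costs):
    """Indices of non-dominated rows, where every column is to be minimised."""
    costs = np.asarray(costs, dtype=float)
    keep = []
    for i, c in enumerate(costs):
        # A row dominates c if it is no worse in all columns and better in one.
        dominated = np.any(np.all(costs <= c, axis=1) & np.any(costs < c, axis=1))
        if not dominated:
            keep.append(i)
    return keep

# Log-spaced candidate values covering a [1e-3, 1e3] search range.
grid = np.logspace(-3, 3, 7)

# Hypothetical cross-validation scores for four parameter configurations.
mse = np.array([7.0, 8.0, 9.0, 7.5])
slope = np.array([0.70, 0.80, 0.75, 0.65])
costs = np.column_stack([mse, np.abs(1.0 - slope)])
front = pareto_front(costs)
```

In this toy case the first two configurations form the front: one has the lowest MSE, the other the slope closest to one, and the final choice between them is left to the analyst.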
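A sequential forward selection loop of the kind mentioned above can be sketched as follows. The stopping rule (stop when no remaining feature lowers the error), the `score` callable, and the toy gain values are assumptions made for illustration; the numbers are invented and carry no physical meaning.

```python
def sequential_forward_selection(features, score):
    """Greedy SFS: add the feature that most reduces `score` until none helps.

    features: list of candidate feature names.
    score: callable taking a list of names and returning an error to minimise
        (for example a cross-validated RMSE).
    """
    selected, best = [], float("inf")
    remaining = list(features)
    while remaining:
        trial_scores = {f: score(selected + [f]) for f in remaining}
        candidate = min(trial_scores, key=trial_scores.get)
        if trial_scores[candidate] >= best:
            break  # no remaining feature improves the score
        selected.append(candidate)
        remaining.remove(candidate)
        best = trial_scores[candidate]
    return selected, best

# Hypothetical scoring function: each feature changes the error by a fixed,
# made-up amount; negative gains make the error worse.
gains = {"HH": 2.0, "HV": 1.0, "altitude": 0.8, "LIA": 0.6,
         "NDVI": 0.5, "land_cover": 0.3, "VH": -0.1, "VV": -0.1}

def toy_score(subset):
    return 8.0 - sum(gains[f] for f in subset)

chosen, error = sequential_forward_selection(list(gains), toy_score)
```

In practice the `score` function would retrain and cross-validate the regressor for each candidate subset, which makes SFS much cheaper than testing every possible combination of features.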
From an operative viewpoint, for the implementation of the SVR algorithm, we considered the LibSVM software, freely available online [
In order to evaluate the estimation performance of the SVR algorithm, different quality metrics were considered: the mean squared error (MSE) (or equivalently the root MSE (RMSE)), which provides information on the average error over the estimates; the slope and intercept of the linear regression line between estimated and true target values, which indicate whether and to what extent the retrieval algorithm under- or overestimates the target variable with respect to the ideal case of a one-to-one line; and the determination coefficient (R²), which measures the spread of the estimates around the linear regression line (in the ideal case of a one-to-one line, this metric equals one). These metrics were evaluated over the available reference samples according to the 5-fold cross validation scheme described before. As previously explained, different input feature configurations were considered in the experiments according to the SFS strategy. Here, due to space constraints, we show and discuss the input feature configuration that provided the best performance, that is, the configuration containing 2 polarimetric features (HH and HV), the 2 topographic features (altitude and local incidence angle), the NDVI, and the land-cover map. Table
Estimation accuracies achieved by the proposed algorithm with the best input feature configuration.
Metric | Global | Meadow | Pasture |
---|---|---|---|
RMSE | 2.68 | 4.05 | 1.68 |
R² | 0.79 | 0.58 | 0.75 |
Slope | 0.78 | 0.58 | 0.7 |
Intercept | 2.26 | 7.15 | 2.3 |
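The four metrics reported in the table can be computed from paired measured and estimated values as in the following sketch. The function name and the sample values are hypothetical; R² is computed here as the squared Pearson correlation, which matches the description of spread around the regression line for a simple linear fit.

```python
import numpy as np

def regression_metrics(measured, estimated):
    """RMSE, slope, intercept and R2 of estimated versus measured values."""
    measured = np.asarray(measured, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    rmse = float(np.sqrt(np.mean((estimated - measured) ** 2)))
    # Least-squares line: estimated = slope * measured + intercept.
    slope, intercept = np.polyfit(measured, estimated, 1)
    # For a simple linear fit, R2 equals the squared Pearson correlation.
    r2 = float(np.corrcoef(measured, estimated)[0, 1] ** 2)
    return {"rmse": rmse, "slope": float(slope),
            "intercept": float(intercept), "r2": r2}

# Hypothetical values: estimates compressed towards the mean (slope < 1).
measured = np.array([4.0, 8.0, 12.0, 16.0, 20.0])
estimated = 0.5 * measured + 6.0
metrics = regression_metrics(measured, estimated)
```

The toy data show why the metrics are complementary: the estimates lie exactly on a line, so R² is one, yet the slope of 0.5 and the nonzero RMSE reveal that low values are overestimated and high values underestimated.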
Scatter plot of estimated versus measured dielectric constant values obtained with the proposed algorithm with the best input features configuration.
Globally, the achieved accuracies are promising, with an RMSE of 2.68 and a determination coefficient close to 0.8. Analyzing the results in more detail, it is possible to observe that the retrieval algorithm performs better over pastures than over meadows. For meadows, the error is higher (RMSE of 4.05 versus 1.68) and the algorithm tends to overestimate low values and underestimate high values of the dielectric constant. This effect is probably due to (1) the range of variability of the target variable, which is much larger for meadows than for pastures, and (2) the number of reference samples, which is lower for meadows than for pastures (see Table
After the training phase and the assessment over point measurements, the SVR algorithm was tested over the available distributed dataset, that is, the RADARSAT2 images acquired in June and July over the Mazia valley. The two images were provided as input to the trained SVR together with the ancillary data, according to the input feature configuration considered for the training of the algorithm. The results of this processing step are two maps representing the estimated dielectric constant values over the area of interest and are shown in Figure
Maps of the dielectric constant of the east side of the Mazia valley: (a) 3rd June 2010 and (b) 21st July 2010. The small squares represent a zoom over particular areas extracted from the maps of June and July (indicated with the white square).
From a qualitative viewpoint, the maps reproduce well the expected trend of soil moisture content, presenting high values near the valley floor (where the irrigated meadows are located) and progressively decreasing values moving to the pastures at higher altitudes. At the same time, the moisture patterns are well recognized, as, for example, in the case of the small rivers going down to the valley floor along the side shown in the details of the maps (Figures
A comparison between the map of June and that of July indicates that the soil is drier on the second date, especially in the lower part of the valley side, as can be observed in the details shown in Figure
In this paper, polarimetric RADARSAT2 SAR images are exploited for the estimation of soil moisture content in an alpine catchment. We first carried out a sensitivity analysis with the help of field measurements of the target parameter and ancillary data. This analysis pointed out that both topography and vegetation/land-cover heterogeneity strongly affect the backscattering signal acquired over alpine areas, introducing a significant variability and ambiguity in the data. The altitude, the local incidence angle, and the NDVI proved to be useful features for explaining the high level of variability intrinsic to the SAR data.
The following step was the development of a technique for the estimation of soil moisture content from the RADARSAT2 images. We opted for an algorithm based on the
Future development of this work will first address a better characterization of the effect of vegetation/land-cover heterogeneity on the SAR signal. This will be carried out with the help of high geometrical resolution data. In particular, the effect of rocks and stones on the microwave signal in relation to the retrieval of soil parameters will be analyzed. A second interesting development is the exploitation of the polarimetric capability of the RADARSAT2 sensor by means of polarimetric decompositions of the signal, in order to improve the feature extraction/selection process and thus the retrieval of soil parameters. Moreover, an extended validation of the algorithm will be considered, exploiting the measurements provided by the field stations in the Mazia valley and further RADARSAT2 SAR acquisitions over the whole Alto Adige area. Finally, the availability of high-resolution, spatially distributed surface soil moisture maps from the RADARSAT2 sensor can represent a major improvement for the validation of distributed hydrological models.