Ambient Ozone Exposure in Czech Forests: A GIS-Based Approach to Spatial Distribution Assessment

Ambient ozone (O3) is an important phytotoxic pollutant, and detailed knowledge of its spatial distribution is becoming increasingly important. The aim of the paper is to compare different spatial interpolation techniques and to recommend the best approach for producing a reliable map for O3 with respect to its phytotoxic potential. For evaluation we used real-time ambient O3 concentrations measured by UV absorbance from 24 Czech rural sites in the 2007 and 2008 vegetation seasons. We considered eleven approaches for spatial interpolation used for the development of maps for mean vegetation season O3 concentrations and the AOT40F exposure index for forests. The uncertainty of maps was assessed by cross-validation analysis. The root mean square error (RMSE) of the map was used as a criterion. Our results indicate that the optimal interpolation approach is linear regression of O3 data and altitude with subsequent interpolation of its residuals by ordinary kriging. The relative uncertainty of the map of O3 mean for the vegetation season is less than 10%, using the optimal method as for both explored years, and this is a very acceptable value. In the case of AOT40F, however, the relative uncertainty of the map is notably worse, reaching nearly 20% in both examined years.


Introduction
Ambient ozone (O 3 ) is a widely studied air pollutant [1]. Due to its unsaturated chemical structure, it is highly reactive and contributes to the oxidative power of atmosphere, essential for scavenging many pollutants from the atmosphere [2]. It has important negative impacts on both human health and the environment as acknowledged in numerous studies [3]. Moreover, due to its absorption-radiation abilities, O 3 is an important greenhouse gas [4,5], and there are significant interactions between O 3 and climate change [6]. Current ambient O 3 levels have increased by approximately two times as compared to those measured over a century ago [7]. Although the magnitude and origin of the hemispheric O 3 trends are still not completely understood [8], there are indications that background O 3 levels over the midlatitudes of the Northern Hemisphere have continued to rise over the past three decades within the range of approximately 0.5-2% per year [9]. A significant contribution to O 3 levels both in Europe and North America originates in East Asia as a result of its dynamic development regarding population growth and increased fossil fuel consumption [10].
For the above reasons, the detailed knowledge of spatial distribution of ambient O 3 levels is becoming increasingly important. To develop a reliable, accurate, and continuous air pollutant surface predicting the values at locations without measurements is an essential task which we frequently encounter in environmental and health-related studies. This benefit of O 3 mapping stands out when viewed alongside the increasingly limited financial resources available for costly ambient air quality monitoring networks.
There are a wide range of techniques available for spatial interpolation, the advantages and limitations of which are widely discussed in the scientific literature [11]. In principle these techniques are classified as deterministic (the nearest neighbor and polynomial regression) or stochastic (geostatistical approaches as kriging and cokriging). The difference between these two is that the geostatistical methods use the 2 The Scientific World Journal spatial correlation structure and allow a prediction variability estimate to assess, under certain conditions, prediction accuracy. In between the deterministic and stochastic methods, there are a wide range of radial basis functions or splines.
The quality of maps of air pollution depends mainly on the quality of the input data measured at the stations, on the number of measuring sites, and also on their spatial distribution [12,13]. Air pollution measurements, particularly those from online permanent monitoring used in routine monitoring networks, are very costly and so the number of sites is generally very limited. The number of required sites depends obviously on the type of air pollutant and on the representativeness of the measuring site. The representativeness is closely related to proximity of emission sources and topography; more stations are needed in complex terrain in contrast to flat [14]. When developing a surface for pollutants with high spatial variability (due to the importance of their local emission sources), for example, PM, benzo(a)pyrene, or toxic metals, more sites are needed. In contrast, pollutants like O 3 , with a more regional character, depending mostly on regional phenomena such as meteorology and long-range or regional air pollution transport need a less-dense monitoring network.
Maps of ambient O 3 , in context of its impacts on environment in rural areas, produced by different approaches were published for different regions: United Kingdom [15], Sierra Nevada in California [16][17][18], and the Carpathians in Europe [19]. Across the EU, mapping of background O 3 at a fine spatial scale (1 × 1 km) was carried out [20], as well as mapping of exposure index AOT40 at 2 × 2 km grid resolution [21,22]. Across the Czech Republic, the method of [15] was applied for ozone deposition mapping [23].
In the Czech Republic (CR), ambient O 3 levels are elevated [24,25], limit values over vast regions are frequently exceeded, and phytotoxic potential is high [26][27][28]. The aim of this paper is to compare the different spatial interpolation techniques and to recommend the optimal approach for producing a reliable map of O 3 with respect to its phytotoxic potential.

Ambient Ozone Data.
We used real-time O 3 levels measured at sites in the nationwide air quality monitoring network by UV absorbance, a reference method as declared by the EC [29]. The ozone analyzers used were the Thermo Environmental Instruments (TEI), M49. The samplers were placed ca 2 m above ground. Standard procedures for quality control and quality assurance [29] were applied. We considered O 3 seasonal means (April-September) and the exposure index AOT40 for forests-AOT40F [30], calculated according to [31]. The input data were 1 h mean concentrations. Data capture required for calculations of seasonal means was 75%. The AOT40 index as a cumulative variable is very sensitive to the quality of measured data [32,33] and obviously also to missing values. More stringent requirements are, thus, needed for calculation of AOT40 as compared to the seasonal mean. In cases when we had less than 90% of hourly O 3 concentrations for the period of April-September for calculating the AOT40F, we corrected it according to [34] so as to prevent underestimation of the O 3 exposure.
With respect to the aim of this study to assess O 3 exposure for forests, urban sites were not considered. From a total of 55 sites monitoring real-time O 3 concentrations across the CR, we accounted only for rural sites as specified by EoI [35]. The rural sites according to EoI are the sites with no important emission sources nearby and which are assumed to be affected only by long-range or regional air pollution transport. Thus, the representativeness of such sites is considerable, mostly in tens to hundreds of kms. The selection of sites resulted in 24 rural sites distributed unevenly across the CR (Figure 1), with more sites at border mountain areas as compared to the interior of the country. Considering that the area of the CR is 79 000 km 2 , the sampling density was approximately one monitor per 3 292 km 2 .

Spatial Interpolation.
Maps for mean vegetation season O 3 concentrations and for exposure index AOT40F for forests were prepared. For spatial interpolation, we used 24 rural sites run by the CHMI. In addition to the Czech sites, we also used data from selected measuring sites in Germany and Poland to improve the interpolation near border areas ( Figure 1). Measuring sites in Slovakia are too distant so they cannot be used. Data from Austrian sites situated near the border were not available. The maps were prepared using ArcGIS Geostatistical Analyst [36] on a grid of 1 × 1 km resolution. A 25 m resolution DEM was used.
For spatial interpolation, eleven methods were used (Table 1). These methods are adequately referenced in scientific literature, so we comment on them only very briefly. In principle, we use three different interpolation methods using measured data only and two basic methods which combine measured and supplementary data, with four subvariants.

Interpolation Methods Using Measured Data Only
(A) Inverse Distance Weighted Method. The inverse distance weighted method (IDW) is likely to be the most frequently used deterministic method [37]. For interpolation we used where Z(s 0 ) is the interpolated value of concentration in the point s 0 , Z(s i ) is the measured value of concentration in the ith point, i = 1, . . . , n, d 0i is the distance between the interpolated point and the ith point with measurement, and n is the number of sites used for interpolation.
(B) Radial Basis Functions Method. Radial basis functions (RBF) interpolate the measured value while minimizing the total curvature of the surface. The interpolation is described by The Scientific World Journal  where Φ(x) is a specific RBF function, d 0i is the distance between the interpolated point and the ith point with measurement, w 1 , . . . , w n+1 are the weighting parameters, and n is the number of surrounding sites used for interpolation. Although calculation of the RBF and estimation of its parameters is rather complicated, the computation is simple and fast. The parameters w 1 , . . . , w n+1 are obtained from the system of equations given by 4 The Scientific World Journal A more detailed description is given by [36]. Coyle et al., 2002, [15] applied this interpolation technique within his approach for ambient O 3 mapping for Great Britain.
(C) Ordinary Kriging. Ordinary kriging is a geostatistical interpolation method. It considers the statistical model: where μ represents the constant mean structure of the concentration field, ε(s) is a smooth variation plus measurement error (both zero-mean), and D is the examined area.
The interpolation is performed according to the equation where Z(s 0 ) is the interpolated value of concentration in the point s 0 , Z(s i ) is the measured value of concentration in the ith point, i = 1, . . . , n, n is the number of surrounding sites used for interpolation, and λ 1 , . . ., λ n are the weights assumed based on a semivariogram. The weights λ i are derived from a semivariogram γ(·) in order to minimize the mean square error. The explicit calculation is achieved by the system of equations given by Kriging is a commonly used standard method. For construction of an O 3 surface, it was used, for example, by [38][39][40].

Interpolation Methods Using Both Measured and
Auxiliary Data. The methods described in Section 2.2.1 were used only for the interpolation of the measured O 3 concentrations. Apart from these methods, there are others using wellcorrelated physical relationships between concentrations and other characteristics, for which more complex spatial information is known. The simplest approaches are the linear regression models without spatial interpolation; more complicated are different combinations of linear regression and spatial interpolation.
where X i (s) are different supplementary parameters at the point s, for i = 1, 2, . . . , c, and a 1 , a 2 , . . ., are the parameters of the linear regression model. In our case, altitude is used as the auxiliary parameter.
(B) Linear Regression Model Followed by the Spatial Interpolation of Residuals. The interpolation is estimated according to where Z(s 0 ) is the estimated value of the O 3 concentration at the point s 0 , X 1 (s 0 ), X 2 (s 0 ), . . ., X n (s 0 ) are the n number of individual auxiliary variables at the point s 0 , c, a 1 , a 2 , . . ., a n are the n selected parameters of the linear regression model calculated at the points of measurement, and η(s) is the spatial interpolation of the residuals of the linear regression model at the points of measurement. The output of a dispersion model, altitude, meteorological variables (temperature, relative humidity, global radiation) may be among the auxiliary characteristics. For preparing an O 3 surface, this method was used, for example, by Loibl et al., 1994 [41], who used relative altitude as the auxiliary characteristics, Horálek et al., 2008 [42], who used model EMEP, altitude, and global radiation as the auxiliary characteristics, and Abraham and Comrie, 2004 [43].
In this study we used the altitude as the sole auxiliary characteristic. The major reason was that preliminary analysis of our data showed the best association between O 3 concentrations and altitude. Inclusion of meteorological variables did not bring any further benefit to our model. The assumption of linear distribution of O 3 with altitude was tested prior to the regression analysis.
Different interpolation methods, as described in Section 2.2.1, can be used for interpolation of residuals.

(C) Interpolation of Mean Afternoon Concentration Minus Regression of Mean Afternoon Increment with Altitude.
Ozone concentrations show diurnal variation. Next to this, mean afternoon increment (i.e., the difference between the mean afternoon concentration and the mean whole-day concentration) is strongly related to altitude; see [15]. Coyle introduced the mapping method in which this regression relation is combined with the spatial interpolation of the afternoon concentration; that is, where ρ(s 0 ) is the spatial interpolation of the mean afternoon concentrations at the point s 0 and R Δ (s 0 ) is the regression relation of the increment Δ based on altitude at the point s 0 . While Coyle uses minimum curvature function (i.e., one of the RBF function) for the interpolation of afternoon values, we use ordinary kriging, as it shows generally better results; see bellow.

(D) Interpolation of Mean Afternoon Concentration Minus Regression of Mean Afternoon Increment with Altitude Followed by the Spatial Interpolation of Residuals.
The variant of the method C is the addition of the interpolation of its residuals to the results of this method. As for the method B, different interpolation methods as introduced under Section 2.2.1 are used.

Uncertainty of Maps.
We used cross-validation analysis for the assessment of uncertainty of the map. Crossvalidation compares a value measured at a monitoring site with an estimated value based on interpolation of values measured at other sites. The root mean square error (RMSE) The Scientific World Journal 5  of the map, calculated according to (10), was used as the uncertainty criterion. RMSE should be as small as possible: where RMSE is the root mean square error of the whole map, Z(s i ) is the measured concentration at the ith site, i = 1, . . ., N, Z(s i ) is the concentration at the ith site estimated from concentrations measured at other sites, i = 1, . . ., N, and N is the number of measuring sites.

Results
The  measured data and altitude with subsequent interpolation of its residuals. This held both for O 3 seasonal means and AOT40F but was more pronounced for the O 3 seasonal means.
The relative uncertainty of the map of mean O 3 for the vegetation season was 8% for LR+res OK and ALR+res OK methods for both explored years. This is a thoroughly acceptable value. Even though the IDW method ranking indicated that it was the worst interpolation approach, the relative uncertainty of the map of mean O 3 for the vegetation season was 13%. In the case of AOT40F, however, the relative   uncertainty of the map was notably worse. For LR+res OK, ranking as the best approach, its relative uncertainty values were 18% in 2007 and 19% in 2008, while for IDW, assessed as the worst approach, its values were 21% for 2007 and 20% for 2008. Figures 2 and 3 show the spatial interpolation of mean O 3 concentrations in the 2008 vegetation season prepared by interpolation techniques LR+res OK and ALR+res OK which ranked best in the comparison ( Table 2). The two approaches resulted in maps which are very similar and exhibited only minor differences. The same applies for Figures     the comparison ( Table 3). The relationships between the results of these two interpolation approaches are presented by scatter plots (Figure 6). Notably better results were obtained for O 3 seasonal means.

Discussion
To produce a reliable air pollutant map to predict values in regions without measurements is an essential yet challenging 8 The Scientific World Journal task. Generally, dense monitoring networks are expensive but give a precise picture of spatial variability of a given phenomenon. Sparse sampling and monitoring networks, however, although less expensive, may miss significant spatial features of the studied phenomenon [12]. To make a real monitoring network denser, it is possible, for example, to add virtual measuring sites to improve the quality of interpolation [44]. Currently, however, no rigorous methodology for the determination of the number of monitoring sites sufficient/adequate to develop a reliable air pollutant surface is available [45]. Familiarity with the terrain and various phenomena that could affect the air pollutant concentrations and distributions are among the most important issues [12].
For mapping purposes, a number of techniques are available. There are substantial qualitative differences between the maps derived using different interpolation techniques as shown, for instance, by [46] for the maps of NO 2 . The assessment of performance of the different techniques is extremely important. The maps derived by different interpolation techniques may be compared and evaluated by using the objective criteria, such as cross-validation [42]when we omit one site in interpolation process and predict its values based on the rest of the sites and then compare the predicted and measured values.
Presented maps are applicable merely for rural areas. The obvious limitation of the maps is the low number of measuring sites which are unevenly distributed across the country. The spatial distribution of sites has strong historic connotations. Originally the measuring sites were located preferentially to more polluted regions, and they still remain in their original setting to observe the long-time trends.
Our results show that using the auxiliary data, in our case the dependence of O 3 concentrations on altitude in particular, significantly improves the overall quality of the resulting map (see Tables 2 and 3). Meteorology was not factored into the linear regression models as the preliminary analysis of our data showed that including meteorological variables did not bring any further benefit to our model. The likely reason is fairly low number of O 3 measuring sites. In near future we intend to use the Eulerian photochemical dispersion model CAMx [47] as auxiliary characteristics. The preliminary results seem to be promising. Next to this, it can be stated that the methods using ordinary kriging in its spatial interpolation part show the best results.
If we compare the two methods which ranked as the best, the LR+res OK (i.e., linear regression of measured values with altitude followed by the interpolation of its residuals using ordinary kriging) approach slightly outperforms ALR+res OK (i.e., the Coyle's approach [15] followed by the interpolation of its residuals using ordinary kriging) or gives comparable results. ALR+res OK, however, is much more complicated and demanding for computation and, thus, less practical for application.
We can reasonably assume that notably worse results of mapping of AOT40F as compared to mean O 3 concentration for a vegetation season (see Figure 6) are due to more random/incidental factors affecting AOT40. Moreover, exposure index AOT40 is less robust characteristic as compared to mean O 3 concentrations [33].
Presented maps show the high-resolution O 3 spatial patterns which can be used for assessment of O 3 effects on vegetation. Exposure maps in particular are useful for indication of areas with high O 3 phytotoxic potential for forests and were already used across the Czech forests earlier [26]. Spatial patterns of O 3 seasonal means are useful for estimation of O 3 deposition as published for the Czech coniferous and deciduous forests by [23] and for estimation of stomatal flux.

Conclusions
We developed reasonable continuous surfaces for ambient O 3 vegetation season mean concentrations and AOT40F using eleven interpolation approaches. The comparison based on RMSE indicates that linear regression between measured O 3 data and altitude with subsequent interpolation of its residuals outperforms the interpolation techniques IDW, radial basis functions, and ordinary kriging significantly. This holds for both O 3 seasonal means and AOT40F, and, in the case of O 3 seasonal means, this feature is more pronounced as compared to AOT40F. Considering all different aspects, including the results of cross-validation analysis and the demandingness of computation, linear regression of O 3 data and altitude with subsequent interpolation of its residuals by ordinary kriging can be recommended as the optimal approach out of the eleven spatial interpolation techniques examined. Notably better results in mapping were obtained for mean seasonal O 3 concentrations in comparison to exposure index AOT40F.