Relationships between Passive Sampler and Continuous Ozone (O3) Measurement Data in Ecological Effects Research

In ecological effects research, there is a rapid increase in the application of passive sampling techniques for measuring ambient ozone (O3) concentrations. Passive samplers provide data on cumulative exposures of a plant to a pollutant. However, O3 is not an accumulative contaminant within the plant tissue, and use of prolonged passive sampling durations cannot account for the dynamics of the occurrences of O3 that have a significant influence on the plant response. Therefore, a stochastic Weibull probability model was previously developed and applied to a site in Washington State (1650 m MSL) to simulate the cumulative exposure data from a passive sampler, to mimic the corresponding frequency distributions of hourly O3 concentrations that would otherwise have been obtained by continuous monitoring. At that site the correlation between the actual passive sampler and the continuous monitor data was R2 = 0.74. The simulation of the hourly O3 data was based on and compared with the results obtained from a colocated continuous monitor. In this paper we report the results of the model application to data from an unrelated monitoring site (New Hampshire, 476 m MSL) with poor correlation between the passive sampling and continuous monitoring (R2 = 0.24). In addition, as opposed to the previous work, we provide comparisons of the frequency distributions of the hourly O3 concentrations obtained by the simulation and the actual continuous monitoring. In spite of the major difference in the R2 values, at both sites the simulation provided very satisfactory results within the 95% confidence interval, suggesting its broad applicability. The final objective of this overall approach is to develop a generic model that can simulate reasonably well the occurrences of ambient O3 concentrations that are dependent upon the elevation of the measurement site and the synoptic and local meteorology. 
Such an effort would extend the relative utility of the passive sampling data in explaining stochastic plant responses.


INTRODUCTION
The use of passive ozone (O3) samplers in ecological effects research is increasing rapidly because they are easy to use, inexpensive, and do not require electrical power. Passive sampling systems yield cumulative exposures of a receptor to air pollutants, expressed as total or average concentrations integrated over the sampling period [1]. Although a passive sampling duration of 12 h has recently been used in a program oriented toward regional-scale air-quality regulation [2], most plant scientists have employed 1-week [3] or 2-week [4] sampling periods because of logistic and other considerations. Cumulative exposures derived from prolonged sampling periods cannot account for the dynamics and flux of pollutant occurrences and the corresponding plant responses, which are governed by changing climatic patterns and by the physiological and growth characteristics of the plant species [5,6]. Different growth or phenological stages of a plant can respond differently to the same or varying O3 exposure patterns. The final biomass response is a product of the stress effect minus the energy expended for repair, compensation, or avoidance [5].
Because of the aforementioned considerations, Krupa et al. [7] developed and verified a stochastic Weibull probability model to simulate the underlying frequency distributions of hourly O3 concentrations (exposure dynamics) using the single, weekly mean values obtained from a passive (sodium nitrite absorbent) sampler. The simulation was based on the data derived from a colocated continuous O3 monitor. The ultimate goal of this approach is to develop a generalized model that includes elevation and the main climatic factors, so that passive sampler data can be used more effectively, in the absence of continuous O3 monitoring, to understand stochastic vegetation responses under ambient conditions.

MODEL DESCRIPTION AND STEPS
There are several steps in the implementation of the model:
(1) The empirical distribution functions of the hourly O3 concentrations are derived from the continuous monitoring data.
(2) Based on the published literature [8], the cumulative distribution function of the three-parameter Weibull frequency distribution can be used as the theoretical distribution to fit the data:
$$F(x) = 1 - \exp\left[-\left(\frac{x - \xi_0}{\alpha}\right)^{c}\right], \quad x > \xi_0$$
where c is the shape of the distribution, α is the scale, and ξ0 is the location [9].
(3) In order to fit the Weibull distribution, its parameters (shape, scale, and location) are estimated by minimizing the Kolmogorov-Smirnov (KS) distribution distance. The KS distribution distance can be computed as [10]:
$$D = \sup_{x} \left| F_n(x) - F(x) \right|$$
where F_n(x) is the empirical distribution function of the data and F(x) is the Weibull cumulative distribution function. The best estimates of the shape, scale, and location of the Weibull distribution are the parameter values that minimize the KS distribution distance. The actual minimization of the KS distance is achieved using a multidimensional random search algorithm optimized by the partial derivatives of KS with respect to the estimated parameters [8].
(4) The main purpose here is to verify the theoretical, statistical methodology regardless of any practical considerations. For this reason the parameters of the best Weibull distributions obtained from the continuous monitoring data in step 3 are used to generate synthetic data, and the distributions of those synthetic data are compared with the distribution of the original continuous monitoring data.
(5) After obtaining the best estimates of the Weibull parameters for the data from all passive sampling periods, a cubic regression model of those parameters can be built using the passive monitoring data as the predictor variable. This model can be written as:
$$y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \varepsilon$$
where x is the passive monitoring data, y is the Weibull distribution parameter being predicted, the β values are the regression parameters, and ε is the random error term [11].
(6) Using the regression model from step 5, new Weibull distribution parameters are computed using the passive monitoring data as predictors. Subsequently the Weibull random-number generator is used to produce new synthetic data.
(7) To test the quality of the results, the cumulative distributions of the Weibull-generated synthetic data obtained in step 6 are compared with the empirical continuous O3 monitoring distribution functions from step 1 using the KS distance.
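The fitting, generation, and comparison steps (3, 4, and 7) can be illustrated with a minimal Python sketch. It is not the implementation used in [7,8]: the search below is a plain random search with a shrinking step size and no derivative guidance, the function names are hypothetical, the "observed" values are simulated stand-ins for one week of continuous-monitor data, and the cubic regression of step 5 is omitted.

```python
import bisect
import math
import random


def weibull_cdf(x, c, alpha, xi0):
    """Three-parameter Weibull CDF with shape c, scale alpha, location xi0."""
    if x <= xi0:
        return 0.0
    return 1.0 - math.exp(-(((x - xi0) / alpha) ** c))


def ks_distance(sample, c, alpha, xi0):
    """Largest gap between the empirical CDF of `sample` and the Weibull CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = weibull_cdf(x, c, alpha, xi0)
        # The empirical CDF steps from i/n to (i+1)/n at x; check both sides.
        d = max(d, abs(f - i / n), abs((i + 1) / n - f))
    return d


def fit_weibull_ks(sample, n_iter=3000, seed=0):
    """Random search for the (c, alpha, xi0) that minimizes the KS distance."""
    rng = random.Random(seed)
    lo, hi = min(sample), max(sample)
    span = max(hi - lo, 1e-9)
    best = (2.0, span, lo - 0.05 * span)  # crude starting guess (assumption)
    best_d = start_d = ks_distance(sample, *best)
    step = 1.0
    for _ in range(n_iter):
        # Perturb the current best point; clamp to keep the CDF well defined.
        c = min(20.0, max(0.2, best[0] + rng.gauss(0.0, step)))
        alpha = min(10.0 * span, max(0.01 * span, best[1] + rng.gauss(0.0, step * span)))
        xi0 = min(hi, max(lo - 2.0 * span, best[2] + rng.gauss(0.0, step * span)))
        d = ks_distance(sample, c, alpha, xi0)
        if d < best_d:
            best, best_d = (c, alpha, xi0), d
        step = max(0.01, step * 0.999)  # slowly narrow the search
    return best, best_d, start_d


def weibull_sample(n, c, alpha, xi0, rng):
    """Draw n synthetic values; random.weibullvariate takes (scale, shape)."""
    return [xi0 + rng.weibullvariate(alpha, c) for _ in range(n)]


def two_sample_ks(a, b):
    """KS distance between the empirical CDFs of two samples."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect.bisect_right(a, x) / len(a) - bisect.bisect_right(b, x) / len(b))
        for x in a + b
    )


rng = random.Random(42)
# Hypothetical stand-in for one week (168 h) of continuous-monitor data, in ppb.
observed = weibull_sample(168, 2.5, 30.0, 10.0, rng)
params, fit_d, start_d = fit_weibull_ks(observed)
synthetic = weibull_sample(len(observed), *params, rng)
check_d = two_sample_ks(observed, synthetic)
print("fitted (c, alpha, xi0):", params)
print("KS distance of fit:", fit_d, "| observed vs synthetic:", check_d)
```

In the full procedure the fitted parameters would be regressed on the passive sampler values (step 5) before the synthetic data are generated; here they are used directly, which corresponds to the verification in step 4.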

RESULTS AND DISCUSSION
Initially the model was developed using colocated passive sampling and continuous O3 monitoring data collected during the growing season (June through August) of 1996 at Paradise, WA (1650 m MSL) [7]. Table 1 provides a summary of the results obtained in step 7 (comparison of the modeled data with the continuous measurements) of the model described in the previous section.

TABLE 1 Summary Statistics of the Kolmogorov-Smirnov (KS) Test for Goodness of Fit Between Continuously Monitored and Synthetic Data on Hourly O3 Concentrations (ppb, Weekly)
Note: The synthetic data were generated from the frequency distribution parameters derived from the passive sampler data (Paradise, 1996).
In this particular case, the mean values from the continuous monitor (Con) were always higher than the passive sampler data, by 1.25 to 14.4 ppb or 3.4 to 30.3%. In contrast, the mean values from the synthetic O3 frequency distribution data (Syn) differed from the corresponding continuous monitor data by only -3.4 to +9.07 ppb or -7.0 to +11.66% (Con-Syn) (Table 1), compared with the observed random sampling bias of the passive sampler itself (-1.25 to -14.4 ppb or -3.4 to -30.3%, Table 1). The overall results predicted by the Weibull model would have been even better if the measured passive sampler mean O3 concentrations had deviated less from the corresponding means of the colocated continuous monitor.
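The Con-Syn differences quoted above are simple differences of weekly means. A minimal sketch of the arithmetic follows, assuming that the percentage is expressed relative to the continuous-monitor mean (the base is not stated in the text); the function name and the hourly values are hypothetical and are not taken from Table 1.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def difference_from_continuous(continuous, other):
    """Return (ppb difference, percent difference) of the means, Con minus other.

    Assumption: the percentage is taken relative to the continuous mean.
    """
    con = mean(continuous)
    diff = con - mean(other)
    return diff, 100.0 * diff / con


# Hypothetical hourly values (ppb) for one sampling period.
con_hourly = [40.0, 44.0, 50.0, 46.0]
syn_hourly = [38.0, 45.0, 47.0, 41.0]
diff_ppb, diff_pct = difference_from_continuous(con_hourly, syn_hourly)
print(diff_ppb, diff_pct)
```

The same function applies to the Con minus passive sampler comparison by passing the passive sampler mean as a one-element list.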
The model was validated by using the 1997 data from the same study site, Paradise [7]. The R2 between the passive sampler and the continuous O3 monitor data (Con) at Paradise was 0.72 during 1996 and 0.46 during 1997. Yet in 1997 the mean values from the synthetic O3 data (Syn) derived from the mean passive sampler measurements differed from the corresponding continuous monitor data by -3.79 to +3.03 ppb or -12.8 to +7.4% (Con-Syn). These ranges were relatively similar to the original model output (-7.0 to +11.7%, Table 1) derived from the 1996 measurements. Thus, within the limitations of the passive sampler data (low R2 in 1997), the model performed extremely well in simulating the hourly O3 concentrations from the colocated continuous monitor [7].
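For reference, an R2 of this kind can be computed from paired weekly values as the squared Pearson correlation. The sketch below assumes that the reported R2 comes from a simple linear relationship between the weekly passive sampler values and the weekly continuous-monitor means, which the text does not state explicitly; the paired values are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


# Hypothetical weekly means (ppb): passive sampler vs. continuous monitor.
passive_weekly = [30.0, 35.0, 40.0, 45.0, 38.0, 33.0]
continuous_weekly = [33.0, 37.0, 45.0, 47.0, 44.0, 34.0]
print(r_squared(passive_weekly, continuous_weekly))
```

With only a handful of sampling periods, a single discordant week can change this statistic substantially, which is relevant when a low R2 is interpreted.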
To further test the performance of the model, an independent data set with a very poor correlation between the passive sampler and a colocated continuous monitor (R2 = 0.24) was selected: data collected during July to August 1994 at Mt. Washington, NH (476 m MSL). Fig. 2 provides the Weibull probability plot of the hourly average O3 concentrations derived from the continuous measurements. The fit of those measured values with the expected values was excellent within the 95% confidence interval.
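A Weibull probability plot can be sketched in its common linearized form, in which ln(x - ξ0) is plotted against ln(-ln(1 - p)) and Weibull-distributed data fall on a straight line with slope c. This is an illustration only: the axis layout of Fig. 2 may differ, the plotting position (i - 0.5)/n is one of several conventions, the location ξ0 is assumed to be known, and the function names and test values are hypothetical.

```python
import math


def weibull_plot_points(sample, xi0=0.0):
    """Return (ln(x - xi0), ln(-ln(1 - p))) pairs for the ordered sample."""
    xs = sorted(sample)
    n = len(xs)
    points = []
    for i, x in enumerate(xs, start=1):
        p = (i - 0.5) / n  # plotting position (one common choice)
        if x > xi0:  # values at or below the location cannot be plotted
            points.append((math.log(x - xi0), math.log(-math.log(1.0 - p))))
    return points


def slope_intercept(points):
    """Ordinary least-squares line through the plot points."""
    n = len(points)
    mx = sum(u for u, _ in points) / n
    my = sum(v for _, v in points) / n
    sxx = sum((u - mx) ** 2 for u, _ in points)
    slope = sum((u - mx) * (v - my) for u, v in points) / sxx
    return slope, my - slope * mx


# Hypothetical check: exact quantiles of a Weibull with c=2.5, alpha=30, xi0=10.
c, alpha, xi0 = 2.5, 30.0, 10.0
n = 50
quantiles = [
    xi0 + alpha * (-math.log(1.0 - (i - 0.5) / n)) ** (1.0 / c)
    for i in range(1, n + 1)
]
slope, intercept = slope_intercept(weibull_plot_points(quantiles, xi0))
print(slope, math.exp(-intercept / slope))  # close to c and alpha
```

Systematic curvature in such a plot indicates a departure from the Weibull form, or a poorly chosen location value.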
Therefore it might be concluded that the poor R2 between the passive sampler and the continuous monitoring data was most likely due to the small sample size of the passive sampler data (n = 6). Turbulent diffusion might also have been a major factor at the study site, contributing to the random variance of the passive sampler performance. This hypothesis could not be tested, since no wind data were available. Nevertheless, the difference between the mean continuous monitoring and the corresponding passive sampler data ranged from -3.26 to +5.37 ppb or -11.5 to +19.6% (Table 2). Similarly, the difference between the mean continuous monitoring and the corresponding passive-sampler-based synthetic data ranged from -4.75 to +3.28 ppb or -21.6 to +13.1% (Table 2). These results, in comparison to the data from Paradise (Table 1), were surprisingly reasonable given the poor correlation between the passive sampler and the continuous monitor data, further substantiating the robustness of the model.

FIGURE 2 Weibull probability plot of the hourly average O3 concentrations from the continuous measurements (July 19-26, 1994). The horizontal axis represents the ordered statistic of the O3 levels. The solid line shows the expected distribution, the solid circles show the distribution of the hourly concentrations, and the two dotted lines represent the 95% confidence intervals. The one data point outside the 95% confidence interval is within the 5% of occurrence of outlier values allowed in the test.

TABLE 2 Summary Statistics of the Kolmogorov-Smirnov (KS) Test for Goodness of Fit Between Continuously Monitored and Synthetic Data on Hourly O3 Concentrations (ppb, Weekly)
Note: The synthetic data were generated from the frequency distribution parameters derived from the passive sampler data (New Hampshire, 1994).
While Table 2 contains summary statistics (weekly mean and SD) of the model output (synthetic data), Table 3 provides the results on the frequencies of the occurrences of hourly O3 concentrations derived from the same synthetic data, on a weekly basis. With the exception of two extreme values during weeks 4 (late July) and 5 (early August) (Table 4), overall these distributions were relatively similar to those derived from the colocated continuous O3 monitor (Fig. 3). This was in spite of the poor correlation between the original passive and the corresponding continuous monitoring data (R2 = 0.24).
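Weekly frequency summaries of this kind can be derived from either the synthetic or the continuous hourly values. The sketch below is illustrative only: the 10-ppb class width and the choice of percentiles are assumptions, since the layout of Tables 3 and 4 is not reproduced here, and the function names are hypothetical.

```python
import statistics
from collections import Counter


def frequency_table(hourly_ppb, bin_width=10):
    """Count hourly values per concentration class, keyed by the class lower bound."""
    counts = Counter(int(v // bin_width) * bin_width for v in hourly_ppb)
    return {lower: counts[lower] for lower in sorted(counts)}


def exposure_summary(hourly_ppb):
    """Median, 5th and 95th percentiles, and maximum of the hourly values."""
    q = statistics.quantiles(hourly_ppb, n=100, method="inclusive")
    return {
        "median": statistics.median(hourly_ppb),
        "p5": q[4],    # 5th percentile
        "p95": q[94],  # 95th percentile
        "max": max(hourly_ppb),
    }


# Hypothetical hourly values (ppb) for part of one week.
week = [5, 12, 18, 25, 31, 39]
print(frequency_table(week))
print(exposure_summary(week))
```

Applying both functions to the synthetic and to the continuous data for the same week gives directly comparable distributions.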

TABLE 3 Summary Statistics of the Weibull-Generated Synthetic Hourly Ozone Frequency Distributions
Note: From the New Hampshire 1994 passive sampler data.

TABLE 4 Summary Statistics of the Continuous Hourly Ozone Frequency Distributions
Note: From New Hampshire 1994.

CONCLUSION
It should be possible to improve the model output further by using multivariate statistical techniques that include climate parameters such as temperature, wind speed, relative humidity, and radiation [12,13]. That work is currently in progress. It might be argued that the overall approach negates the original appeal of passive samplers, namely their ease of use. However, as more of the needed data become available, it should be possible to develop a generic model that has wide spatial application. In that context, elevational and spatial variability in the occurrences of ambient O3 concentrations must be accounted for in relating such information to ecological effects. The present effort provides an important step beyond the normal kriging techniques (with annual or seasonal average or total O3 concentrations) being used in mapping the spatial distribution of ambient O3 exposures [14], and better explains the stochasticity in vegetation response relationships. With the model output, geostatistical techniques can be applied using exposure descriptors such as the weekly median and the percentiles of hourly concentrations in defining the spatial variability in the O3 exposures and the corresponding differences in plant growth characteristics and responses. Furthermore, such information can be used to extend the model output to derive empirical data on the exchange of O3 between the atmosphere and the plant canopy [15]. Grünhage et al. [16] have calculated flux-based (absorbed or effective dose) critical pollutant levels to protect vegetation in Europe against the adverse effects of O3. Similar information is lacking in North America at the present time, although a number of investigators are examining that issue. In addition, there is a growing interest in examining diurnal vs. nocturnal O3 exposure profiles in explaining the observed plant responses [17].
Exposures to pollutant mixtures can result in the maintenance of some degree of stomatal opening at night [18] and consequently the uptake of variable amounts of O3 that are available at the atmosphere-plant canopy interface. In that context, there are efforts underway to develop a passive O3 sampler that functions only during the daylight hours as opposed to the entire day (M. Ferm, IVL, Göteborg, Sweden, personal communication). The current model can also address those types of data if the hours of daily sunrise and sunset at the sampling location are known.
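As an illustration of how daylight-only data could be handled, the sketch below restricts an hourly series to the hours between sunrise and sunset before any distribution is fitted or compared. It assumes that the series starts at midnight, that sunrise and sunset are given as whole hours and are constant over the sampling period, and that a daylight-only passive sampler would report the mean of the retained hours; the function name and values are hypothetical.

```python
def daylight_hours(hourly_ppb, sunrise_hour, sunset_hour):
    """Keep values whose hour of day lies in [sunrise_hour, sunset_hour).

    Assumption: hourly_ppb[i] is the value for hour i counted from
    midnight on the first day of the sampling period.
    """
    return [
        v for i, v in enumerate(hourly_ppb)
        if sunrise_hour <= i % 24 < sunset_hour
    ]


# Hypothetical two-day series in which each value equals its hour index.
series = list(range(48))
daytime = daylight_hours(series, sunrise_hour=6, sunset_hour=20)
daytime_mean = sum(daytime) / len(daytime)
print(len(daytime), daytime_mean)
```

The retained hours can then be passed through the same fitting and comparison steps as the full 24-h series.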