
We modelled the distributions of two toads (

Comparative distribution modelling (i.e., building models that combine or compare the distributions of different species) is a useful tool to assess differences and similarities between species' distribution areas and environmental correlates. It has been applied, for example, to species with partially overlapping distributions [

Comparative modelling has mostly been done in pairs, by regressing presences of one taxon against presences of the other [

Relatively recent developments in distribution modelling [

In this paper, we test fuzzy logic operations as a tool in comparative modelling using two amphibians with partially overlapping distributions, the common toad (

The study area was the Iberian Peninsula, at the south-western edge of Europe (Figure ^{2} heterogeneous region comprising the mainland territories of Portugal and Spain and linked to the continent by a narrow and mountainous isthmus. It thus constitutes a discrete biogeographical unit appropriate for studies on species distributions [

Location of the study area, recorded distributions (black dots: presences on UTM

Species distribution data, consisting of presences and absences on Universal Transverse Mercator (UTM)

The UTM

Factors and their related variables used to model the distributions of ^{(1)}U. S. Geological Survey (1996); ^{(2)}Font (1983, 2000); ^{(3)}I.G.N. (1999); data on the number of inhabitants of urban centres taken from Enciclopédia Universal (

Factor | Variable | Code |
---|---|---|
Topography | Mean altitude (m)^{(1)} | |
Topography | Mean slope (degrees) (calculated from | |
Water availability | Mean annual precipitation (mm)^{(2)} | |
Water availability | Mean relative air humidity in January at 07:00 hours (%)^{(2)} | |
Water availability | Mean relative air humidity in July at 07:00 hours (%)^{(2)} | |
Environmental energy | Mean annual insolation (hours/year)^{(2)} | |
Environmental energy | Mean annual solar radiation (kWh/m^{2}/day)^{(2)} | |
Environmental energy | Mean temperature in January (°C)^{(2)} | |
Environmental energy | Mean temperature in July (°C)^{(2)} | |
Environmental energy | Mean annual temperature (°C)^{(2)} | |
Environmental energy | Mean annual number of frost days (min. temperature ≤ 0°C)^{(2)} | |
Environmental energy | Mean annual potential evapotranspiration (mm)^{(2)} | |
Productivity | Mean annual actual evapotranspiration (mm) (=min [ | |
Environmental disturbance | Maximum precipitation in 24 hours (mm)^{(2)} | |
Environmental disturbance | Relative maximum precipitation (= | |
Climatic variability | Mean annual number of days with precipitation ≥ 0.1 mm^{(2)} | |
Climatic variability | Annual temperature range (°C) (= | |
Climatic variability | Annual relative air humidity range (%) (= | |
Human activity | Distance to a highway (km)^{(3)} | |
Human activity | Distance to a town with more than 100,000 inhabitants (km)^{(3)} | |
Human activity | Distance to a town with more than 500,000 inhabitants (km)^{(3)} | |

We built generalized linear models with a binomial distribution and the logit link of the favourability function [
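As an illustration of how a probability predicted by a binomial GLM relates to favourability, the sketch below uses the form in which the favourability function is usually written, F = [P/(1 − P)] / [n1/n0 + P/(1 − P)], where P is the predicted probability and n1 and n0 are the numbers of presences and absences in the modelled data. The function and argument names are hypothetical, and this is a minimal sketch rather than the code used in this study.

```python
def favourability(p, n_presences, n_absences):
    """Convert a predicted probability into a favourability value.

    Assumed form: F = odds / (prevalence_odds + odds), with
    odds = p / (1 - p) and prevalence_odds = n_presences / n_absences.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    if n_presences <= 0 or n_absences <= 0:
        raise ValueError("both presences and absences are needed")
    if p == 1.0:
        return 1.0  # limit of the expression as p approaches 1
    odds = p / (1.0 - p)
    prevalence_odds = n_presences / n_absences
    return odds / (prevalence_odds + odds)


# Hypothetical data set with 25 presences and 75 absences (prevalence 0.25).
f_at_prevalence = favourability(0.25, 25, 75)
f_above_prevalence = favourability(0.60, 25, 75)
```

Under this form, a locality whose predicted probability equals the prevalence of the species receives a favourability of 0.5, which is what makes values comparable between species with different numbers of presences.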

To avoid a spurious effect of surface area on the probability of the species being present, only complete UTM cells, and not those that are cut by the study area borders or the unions between UTM zones, were used for the inductive stage of the modelling. Models were then applied to the whole study area [

Variables were included in the models using a forward-backward stepwise procedure [

a favourability model for

a favourability model for

a favourability model for the occurrence of both species together, where 1 = presence of both and 0 = absence of at least one of them,

a model of favourability for either of the two species, where 1 = presence of at least one and 0 = absence of both species,

a model of favourability for the presence of

a model of favourability for the presence of
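The response variables of the combined-data models can be sketched as follows. The coding of "both" and "either" follows the definitions given above; the coding of the last two responses assumes that they are restricted to localities with exclusive presence of one of the two species, which is our reading of the text. Species are labelled generically as A and B, and all names are hypothetical.

```python
def combined_responses(pres_a, pres_b):
    """Build binary response variables from two presence/absence vectors.

    pres_a, pres_b: sequences of 0/1 values, one per grid cell.
    """
    if len(pres_a) != len(pres_b):
        raise ValueError("both vectors must cover the same cells")
    pairs = list(zip(pres_a, pres_b))
    # 1 = presence of both species, 0 = absence of at least one.
    both = [int(a == 1 and b == 1) for a, b in pairs]
    # 1 = presence of at least one species, 0 = absence of both.
    either = [int(a == 1 or b == 1) for a, b in pairs]
    # Assumption: only cells holding exactly one of the species are used.
    exclusive = [(a, b) for a, b in pairs if a != b]
    a_not_b = [int(a == 1) for a, _ in exclusive]
    b_not_a = [int(b == 1) for _, b in exclusive]
    return {"both": both, "either": either,
            "a_not_b": a_not_b, "b_not_a": b_not_a}


# Hypothetical presence/absence records on five cells.
responses = combined_responses([1, 1, 0, 0, 1], [1, 0, 1, 0, 0])
```

Note how the last two responses are computed on a reduced set of cells and are exact complements of each other.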

Models C1 to F1 were compared, respectively, with their fuzzy logic counterparts from C2 to F2, resulting from the following operations between models A and B:

fuzzy intersection between the individual models (logic “A and B”),

fuzzy union of the individual models (logic “A or B”),

fuzzy intersection between model A and the complementary of model B (logic “A and (not B)”),

fuzzy intersection between model B and the complementary of model A (logic “B and (not A)”).
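The four operations listed above can be sketched with the standard fuzzy set operators, in which the intersection is the minimum, the union is the maximum, and the complement is one minus the membership value. The favourability values below are hypothetical and the function names are illustrative only.

```python
def fuzzy_and(*models):
    """Fuzzy intersection: cell-wise minimum of favourability values."""
    return [min(values) for values in zip(*models)]


def fuzzy_or(*models):
    """Fuzzy union: cell-wise maximum of favourability values."""
    return [max(values) for values in zip(*models)]


def fuzzy_not(model):
    """Fuzzy complement: one minus the favourability value."""
    return [1.0 - value for value in model]


# Hypothetical favourability values of models A and B on three cells.
fav_a = [0.75, 0.25, 0.5]
fav_b = [0.5, 0.75, 0.125]

c2 = fuzzy_and(fav_a, fav_b)             # A and B
d2 = fuzzy_or(fav_a, fav_b)              # A or B
e2 = fuzzy_and(fav_a, fuzzy_not(fav_b))  # A and (not B)
f2 = fuzzy_and(fav_b, fuzzy_not(fav_a))  # B and (not A)
```

Because the operators accept any number of models, the same functions apply unchanged to comparisons involving more than two species.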

Note that models E1 and F1, which use presence-only data, are bound to be identical except for the opposite signs of the variables’ coefficients, whereas their counterparts E2 and F2 will probably differ from each other. This is why we built both models.

The capacity of each model to discriminate between the modelled events (i.e., presence versus absence or presence of one species versus presence of the other) was assessed with the Area Under the receiver operating characteristic (ROC) Curve (AUC). This is a widely used model evaluation measure that provides a single-number discrimination measure across all possible classification thresholds for each model, thus avoiding the subjective selection of one threshold [
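For readers unfamiliar with the measure, the AUC can be computed without choosing any threshold, as the proportion of (positive, negative) pairs in which the positive case receives the higher predicted value, with ties counted as one half. The sketch below implements this pairwise definition in plain Python; the names and example values are hypothetical, and it is not the software used in this study.

```python
def auc(scores, labels):
    """Area under the ROC curve from the pairwise (Mann-Whitney) definition.

    scores: predicted values; labels: 1 for the modelled event, 0 otherwise.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Hypothetical predictions: perfect and chance-level discrimination.
auc_perfect = auc([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0])
auc_chance = auc([0.9, 0.2, 0.8, 0.3], [1, 1, 0, 0])
```

The double loop is quadratic in the number of cells, which is adequate for illustration; rank-based formulations give the same value more efficiently.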

We also compared the favourability values predicted by the models of combined species data and the corresponding fuzzy logic operations between individual species models, using two different measures: Spearman’s nonparametric rank correlation between favourability values, with Dutilleul’s [

There were 3554 presences of

Number of analysed presences and absences and measures of the overall similarity between the predictions produced by modelling combined species distribution data and by fuzzy logic operations between individual species models. For model abbreviations, please see Section

Model comparison | Presences | Absences | Spearman’s correlation | Fuzzy numerical comparison |
---|---|---|---|---|
C1 versus C2 (favourability for presence of both) | 2412 | 3052 | 0.873 | 0.830 |
D1 versus D2 (favourability for presence of any) | 4273 | 1191 | 0.840 | 0.855 |
E1 versus E2 (favourability for | 1142 | 719 | 0.788 | 0.724 |
F1 versus F2 (favourability for | 719 | 1142 | 0.861 | 0.676 |

Comparison of predicted environmental favourability for

The individual models obtained for

The

Top row: Comparison of the receiver operating characteristic (ROC) curves and the areas under them (AUC) for models of combined species data and the corresponding fuzzy logic operations between individual species models. Middle row: Scatter plots and linear regression lines comparing favourability values given by combined models and those given by fuzzy logic operations between individual species models. Bottom row: Box plots showing median, upper, and lower quartiles, and extreme values for favourability given by combination models and the corresponding fuzzy operations.

The predicted values derived from modelling combined species distribution data were also generally similar to the results of fuzzy logic operations between the two single-species models (Figure

The relatively low AUC values obtained for both

Models confronting the presence of

They avoid the need to build additional models: the single-species models are enough.

They allow using all distribution data available, that is, all the localities in the study area, and not only those with exclusive presence of one of the species. This increase in sample size allows a better model calibration and thus can enhance the predictive power of the models.

They allow simultaneous multispecies comparisons, instead of comparing species only two at a time; models such as C1 may become impracticable when applied to many species, because the number of localities where all the species have been recorded decreases with the number of species analysed, whereas models such as C2 are not affected by this.

Modelling the presence of either of two species (as in model D1 in our study) gives greater weight to the species with the higher number of presences, whereas combining individual species models with fuzzy logic gives the same importance to all species involved.

Our results showed that favourability models for two species combined by means of fuzzy logic operations perform similarly to models of combined data for these species. Although we have not tested this specifically, we may assume that the method will work in other situations, differing, for example, in number of species, the magnitude of the differences between their distribution areas, species prevalence, or the geographical extent of the study area. The modelling method, however, should provide directly comparable numerical predictions, as is the case with the favourability function [

A fuzzy classification technique (fuzzy envelope model, FEM) has been applied [

Favourability values are here considered as the degree of membership to the fuzzy set of localities favourable to the analysed event (presence of one species, of any of them, of both together, and of one instead of the other). Degrees of membership are sometimes confused with probability values, in part because both take values between 0 and 1. However, the conceptual consequences of this difference between degree of membership and probability are relevant. Local favourability denotes a measure of the degree to which local conditions cause local probability to differ from the probability expected at random, that is, from that expected according to the prevalence of the event [

The mathematical consequences of this difference between degree of membership and probability are also relevant. The probability of simultaneous occurrence of several events is calculated by multiplying the individual probabilities of each event, which inevitably yields increasingly lower output values as more events are taken into account. The use of fuzzy logic operations avoids this mathematical problem, as favourability for the simultaneous occurrence of several events is computed as the favourability for the least favourable event [
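A small numerical sketch makes the contrast concrete. It assumes a hypothetical locality where every species has the same value of 0.8, and it treats the product as the joint probability of the events, which additionally presupposes that they are independent. All names and values are illustrative.

```python
import math


def joint_probability(values):
    """Product of individual probabilities (assumes independent events)."""
    return math.prod(values)


def fuzzy_intersection(values):
    """Favourability for all events together: the least favourable one."""
    return min(values)


# Hypothetical locality where each of n species has a value of 0.8.
comparison = {}
for n_species in (1, 2, 5, 10):
    values = [0.8] * n_species
    comparison[n_species] = (joint_probability(values),
                             fuzzy_intersection(values))
```

With ten species, the product falls to roughly 0.11 even though every individual value is high, whereas the fuzzy intersection remains at 0.8.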

Neftalí Sillero merged and kindly shared the species distribution data from Portugal and Spain. Christoph Scherber adapted and kindly shared the script for AICc-based model selection. A. M. Barbosa is supported by a postdoctoral fellowship (SFRH/BPD/40387/2007) from Fundação para a Ciência e a Tecnologia (Portugal), cofinanced by the European Social Fund. The “Rui Nabeiro” Biodiversity Chair is financed by Delta Cafés and an FCT project (PTDC/AAC-AMB/098163/2008).