Potential Distribution of Chagas Disease Vectors (Hemiptera, Reduviidae, Triatominae) in Colombia, Based on Ecological Niche Modeling

Ecological niche modeling of Triatominae bugs allow us to establish the local risk of transmission of the parasite Trypanosoma cruzi, which causes Chagas disease. This information could help to guide health authority recommendations on infection monitoring, prevention, and control. In this study, we estimated the geographic distribution of triatomine species in Colombia and identified the relationship between landscape structure and climatic factors influencing their occurrence. A total of 2451 records of 4 triatomine species (Panstrongylus geniculatus, Rhodnius pallescens, R. prolixus, and Triatoma maculata) were analyzed. The variables that provided more information to explain the ecologic niche of these vectors were related to precipitation, altitude, and temperature. We found that the species with the broadest potential geographic distribution were P. geniculatus, R. pallescens, and R. prolixus. In general, the models predicted the highest occurrence probability of these vectors in the eastern slope of the Eastern Cordillera, the southern region of the Magdalena valley, and the Sierra Nevada of Santa Marta.


Introduction
Triatomines (Hemiptera, Reduviidae) are hematophagous insects that are considered important vectors of Trypanosoma cruzi. They are widely distributed in America with rare occurrences in Eastern Asia, though they are primarily found in the Neotropical region [1]. Disease transmission occurs from southern United States to Argentina, with approximately 25% of the human population at risk [1,2].
Triatomines have a wide range of climatic and ecological tolerability because they inhabit diverse ecosystems, from humid to dry forests across the Americas [2]. They are found at different altitudes and have high vagility, which allows them to exploit diverse food resources and to find different resting sites, including intra-and peridomiciliary habitats [3][4][5][6]. Therefore, at the species level, habitat preferences are influenced by environmental factors, which can strongly drive their distribution [7][8][9]. Temporal and spatial changes in temperature, precipitation, and humidity affect their biology and ecology, which can alter the risk of transmitting T. cruzi [10]. These insects appear to be able to move between microclimates within their habitats while seeking the most favorable conditions [11]. Local environmental determinants can be described as a function of altitude, climate, vegetation type, and land use [11]. For some species of Triatominae, vegetation type appears to be an important predictor of abundance [12]. Likewise, higher Triatominae abundance appears to be associated with perturbed vegetation (agriculture and pasture) [13]. This suggests that deforestation and habitat degradation are important factors contributing to the habitat of some species of triatomine.
Remote sensing, geographic information systems (GIS) and their use for modeling species distribution, have allowed identifying high infestation areas, data simplification, and 2 Journal of Tropical Medicine control efforts. Ecological niche models are empirical or mathematical estimates of species ecological niche in terms of suitable habitat conditions [14,15]. They permit the association of different ecogeographical variables (e.g., environmental, topographical, and anthropogenic) with the species distribution, by identifying the variables that constrain and define the particular niche. The modeling outcome can be a spatial representation of the favorable habitat for a species occurrence. Therefore, suitability maps for the species, as function of their ecological niche, can be created [16].
Ecological niche models have been used to build potential distribution models for 18 Triatominae species in Brazil [17][18][19], Triatoma infestans in Argentina [12], T. dimidiata in México [13], and Rhodnius pallescens, R. prolixus, T. dimidiata, and Panstrongylus geniculatus (unpublished data) in Colombia [12,[20][21][22]. Although there have been some advances in the understanding of the distribution patterns of these vectors, it is necessary to increase our knowledge to refine and improve the use of remotely sensed environmental variables. This will help to describe and predict the distribution of the triatomine species of epidemiological importance because they invade houses and transmit T. cruzi.
The Chagas Disease Control Program in Colombia (CDCP) [23] manages vector control programs through entomological surveys in the 20 departments where the disease is endemic. This program allows the identification of transmission risk zones according to entomological indices. However, CDCP does not use spatial technologies like GPS, GIS, and spatial analysis in the endemic departments [24]. The main species transmitting T. cruzi in Colombia are R. prolixus and T. dimidiata, which present a very complex epidemiological cycle that involves domestic, peridomestic, and sylvatic habitats [25,26]. However, other species such as P. geniculatus, R. pallescens, T. maculata, and T. venosa play an important role as secondary vectors of the parasite. These species have greatest association peridomestics and wild habitats and may in the future play an important role in the transmission of the disease. Therefore, monitoring of these species should be included in surveillance programs [25,26].
In this study, we explore the relationship between the geographical distribution of P. geniculatus, R. pallescens, R. prolixus, and T. maculata and different bioclimatic variables, in order to obtain the potential distribution model for each species. We also describe the bioclimatic variables and landscape attributes that best explain the potential distribution of each species in Colombia.

Georeferencing Errors.
A descriptive analysis for each georeferenced location was performed in the R project v.3.0.3 [27] with the sp package [28] through the "buffer" tool. This analysis described the variation of each bioclimatic variable around a collecting site and allowed us to identify the sensitivity per grid cell to the georeferencing errors. Since climatichomogeneous areas would not have biases, an influence area of 10 km of radius was generated for each occurrence, and descriptive statistical parameters, such as median and minmax range, were calculated. In addition, linear regression for the occurrence data versus each variable was performed. To avoid influential outliers, we excluded 2.5% of the distribution in each tail from the residual distribution of the linear model. A total of 1281 records were used for the models: P. geniculatus ( = 422), R. pallescens ( = 153), R. prolixus ( = 559), and T. maculata ( = 147).

Avoiding Redundancy of Variables.
As bioclimatic variables can be correlated with each other [30], a cluster analysis was performed to avoid collinearity and overestimation in the models. The Pearson value threshold was chosen to be < 0.7 (Table 1).
Temperature seasonality -X --Bio 5 Max temperature of warmest month X X --Bio 6 Min temperature of coldest month X ---Bio 7 Temperature annual range X ---Bio 8 Mean temperature of wettest quarter -X X -Bio 9 Mean temperature of driest quarter X ---Bio 10 Mean temperature of warmest quarter X ---Bio 11 Mean temperature of coldest quarter ---X Bio 13 Precipitation of wettest month X ---Bio 14 Precipitation of driest month ---X Bio 15 Precipitation seasonality X X X X Bio 17 Precipitation of driest quarter --X -Bio 18 Precipitation of warmest quarter --X X Bio 19 Precipitation of coldest quarter -X X X

Species Distribution Models and Model Validation. The
Maximum Entropy Algorithm, MaxEnt [16], was performed on dismo [31] and ENMeval [32] packages in R. ENMeval allow for automatically partition data into evaluation bins. The potential species distribution maps were obtained applying the default parameters and varying the regularization value from 0.25 to 2 (each 0.25), for a total of 8 analyses.
We compared models based on Akaike Information Criterion (AIC), where the lower the AIC, the better the model fit [33].
The occurrence data for each species were separated into two sets. 50% of the data points were randomly selected as training points and used for model formulation and selection. The remaining 50% of the records were retained as test points. The MaxEnt model output format was set to logistic. In addition, binary maps were created using the 10th percentile training presence logistic threshold to separate the map into binary predictions. To determine which of the variables contributed more to explain the triatomine distributions, we used MaxEnt's internal Jackknife test of variable importance.
The model's performance was evaluated with the Receiver Operating Characteristic (ROC) curve and the Area Under the curve (AUC), as an estimate of the model performance across these thresholds [30]. A total of eight maps were obtained for each species; to choose the best value we plotted the AUC values versus the value (Table 2) and then the models were run (five replicates) with the regularization value of 0.75, since the models behavior did not change significantly with higher values of .
2.6. Landscape Matrix. The suitable zones of potential occurrence were to describe the landscape matrix (mixture of different types of ecosystems) in terms of transformed ecosystems (i.e., due to anthropogenic activities) and natural ecosystems, based on the ecosystems map of Colombia from Etter [34] (Figure 4).

Results
The collinearity analysis showed that the variables annual mean temperature (bio 1) and annual precipitation (bio 12) were redundant and precipitation of wettest quarter (bio 16) and only altitude and precipitation seasonality (bio 15) contributed to explain the distribution across the four species (Table 1). The remaining variables contributed differently to each species model (Table 1 and Figure 2). For P. geniculatus altitude, precipitation seasonality (bio 15) and isothermality (bio 3) were the variables that most contributed to the model. For R. pallescens the three variables that better described the variation in the distribution of the species were precipitation seasonality (bio 15), temperature seasonality (bio 4), and isothermality (bio 3). In the case of R. prolixus, precipitation seasonality (bio 15) and altitude and mean diurnal range (bio 2) were the most important variables that explained the model. And for T. maculata the variables that better described the distribution were precipitation of driest month (bio 14), precipitation seasonality (bio 15), and altitude.
The models predicted the highest occurrence probability in the eastern slopes of Eastern Cordillera, the southern region of the Magdalena valley, and the Sierra Nevada of Santa Marta. In particular, P. geniculatus showed a high occurrence probability towards the Sierra Nevada of Santa Marta, north of Central Cordillera to the Serranía de San Lucas, the eastern and western slopes of Eastern Cordillera, and the lowlands of serranía de Perijá (mean AUC = 0.834, sd = 0.012; mean Kappa = 0.737, sd = 0.025). In addition, there are few places with high probability in the southernmost of the Western Cordillera and in the Caribbean coast (Figure 3(a)). Rhodnius pallescens showed the widest pattern distribution (mean AUC = 0.906, sd = 0.011; mean Kappa = 0.771, sd = 0.066), with the highest occurrence probability in the Caribbean coast, Magdalena valley, Urabá Gulf, center region of Cauca valley, and lowlands of serranía de Perijá (Figure 3(b)). For R. prolixus the occurrence is more limited (mean AUC = 0.881, sd = 0.014; mean Kappa = 0.701, sd = 0.055), with the highest occurrence probability in the eastern slope of Eastern Cordillera towards the Orinoquía region, the southern region of the Magdalena valley, the southern region of Western and Central Cordilleras, and the Sierra Nevada of Santa Martha (Figure 3(c)). Finally, T. maculata showed the most restricted pattern (mean AUC = 0.928, sd = 0.003; mean Kappa = 0.791, sd = 0.091), with highest occurrence probability in the Caribbean coast towards the Guajira peninsula, the eastern slope of Eastern Cordillera towards the Orinoquía region, and a restricted region of Orinoquía near the Venezuelan border (Figure 3(d)).
In general, we found that the Triatominae species were more likely to be present in transformed ecosystems (

Discussion
We found that P. geniculatus, R. pallescens, and R. prolixus have wider distributional ranges than T. maculata. This observation is based on the number of ecosystems occupied by the species. Although sampling effort and species knowledge largely influence these results, these findings reveal interesting patterns related to species tolerance to disturbed environments. Human activities that negatively impact the natural ecology of triatomine species, such as deforestation, are likely to encourage adaptation to the domestic environment [35,36].
The relationship between climatic factors and the geographical distribution of Chagas's disease vectors has been studied by several authors. The studies of Carbajal de la Fuente et al. [18] and Mischler et al. [37] concluded that Triatominae bugs are very sensitive to small variations in temperature and humidity. Carcavallo and Barreto [38] classified the diversity of the Triatominae species in two well-defined seasons with a dry period and a rainy period. The same author associated high population densities with long seasons of dryness and high temperatures. Zeledón and Rabinovich [4] analyzed the influence of climatic factors on different Triatominae species, such as T. infestans, R. prolixus, and P. megistus. They concluded that the most important climatic determinants of the geographical distribution of these vectors were temperature and relative humidity.
We found that variations in the predicted distribution of P. geniculatus were primarily explained by altitude, precipitation seasonality (bio 15), and isothermality (bio 3). A large proportion of the potential range of this species is associated with transformed ecosystems. In Colombia, according to Guhl et al. [25] P. geniculatus has been reported in 25 departments. The main habitats are burrows and nesting places of marsupials, bats, rodents, and birds, but adult specimens have also been collected from peridomiciles and houses, as these bugs are strongly attracted by artificial lights [39]. This species has been found at altitudes close to 1700 m.a.s.l. (metres above sea level) [40]. Subsequently, it has been speculated that, after an initial adaptation to environments of high humidity in the Amazon, its association with the humid microclimate of armadillo burrows has facilitated its broad geographical distribution beyond the limits of the Amazon [35]. A few other species are also widely distributed, possibly due to their potential to exploit a broader range of habitats or because their ecological niches are widespread. This species is more sensitive to environmental devastation due to its microclimatic requirements. Therefore, in areas of rapid and intense deforestation, the species is less likely to survive compared to where the changes occur on a smaller scale or more slowly [41].
Rhodnius pallescens has been reported in Belize, Nicaragua, Costa Rica, Panama, and Colombia, where it inhabits sylvatic environments and it is often found in human dwellings, although without intradomestic colonies [20]. Colonies of the species are also found in sylvatic ecotopes such as crowns of at least four species of palms: Attalea butyracea, Cocos nucifera, Elaeis oleífera, and Oenocarpus bataua. [42,43]. The palm tree A. butyracea is its primary biotope. These palms, besides serving as shelter for triatomines, are occasional habitats for a great diversity of fauna of mammals, which in turn are a food source for triatomine bugs and reservoirs of the parasite. During the dry season, palms offer optimum humidity and temperature conditions for the development and multiplication of both triatomine and its associated fauna. The leaves are used for thatching houses, a situation that helps create foci of infestation because it promotes the transfer of the Triatominae to the house. The wide potential distribution of R. pallescens can be explained by the distribution of the four species of palms mentioned above. This species has a wide geographical distribution in the country and is considered a potential problem since it is a candidate to replace the domestic R. prolixus when the latter will be eliminated from houses as a consequence of control campaigns [42]. Our model for R. pallescens coincided with previous models for the species [44]. For R. pallescens the predicted areas in the Caribbean plains have the ecosystems of tropical rain forest and tropical dry forests.
Rhodnius prolixus has a restricted geographical potential distribution, confined to the Andean region. Possibly, its spatial range has decreased due the control actions carried out by the secretaries of health of the departments of Boyacá, Cundinamarca, and Santander. However, the obtained predicted distribution agrees with the areas that have historically been most prevalent for Chagas disease in the country. We report new areas of potential distribution of R. prolixus to the Oriental Plains that coincide with industrial plantations of oil palm (Elaeis guineensis). Sylvatic R. prolixus have been reported in these palms [45]. Also we observed a potential distribution to the south of the country which should be studied in more detail.
Our findings coincide in part with previous distributional models generated for R. prolixus in Colombia. These models showed two distinct distributional areas [21]: one geographical zone at the east of Eastern Cordillera, associated with the environmental variables used in the work of Guhl [21], normalized difference vegetation index (NDVI), land surface temperature (LST), middle infrared (MIR), air temperature (TEMP), and altitude, and another zone at Northwestern of the Central Cordillera with low association with these variables. Guhl [21] found little association between environmental variables, such as temperature, altitude, and vegetation cover, northwest of the Central Cordillera. The lack of association was attributed to the passive dispersion of vector populations by human migration. In contrast, other studies have shown a high association between relative humidity, maximum temperature, and species presence [11]. Rhodnius prolixus is the most efficient vector of T. cruzi, due to its biological features, namely, fast reproduction rates, adaptation to the human environment, and high rates of defecation and infection [25]. In this study, we found that this species is potentially distributed across highly populated regions of the country (towards the northern portion of the Eastern Cordillera in Santander and Boyacá departments; Figure 3(c)). Furthermore, it is predicted to be present in the Oriental Plains region (Casanare and Meta departments). These findings are in agreement with the known distribution of R. prolixus in Colombia.
This study is among the first ones that has attempted [20][21][22] to model the distribution of T. maculata in Colombia. The distribution of this species seems to be mainly explained by precipitation variables (Figure 2(d)). Because the highest occurrence probability was towards the Caribbean coast and Oriental Plains, and the variation of seasonality precipitation would not affect the presence of the species, and T. maculata may be considered an emerging vector in the Northern Andean countries (Venezuela and Colombia). In some areas of Venezuela and Colombia, this species has the capacity to colonize human dwellings and may be responsible for Chagas disease transmission [25]. Usually, this species is peridomestic [46], found in outer walls of the houses, chicken coops, dove nests, and sylvatic environments: dead trunks, palm trees Attalea complex, bromeliads, caves, and birds' Colombia. The color scale shows the following: white, zero species; green, one species; light green, two species; brown, three species; and red, four species.
We found a high coincidence of the predicted distribution of T. maculata with transformed and environmentally degraded ecosystems, which in these areas coincide with the domiciliary and peridomiciliary habitats.
As it has been suggested for some Triatominae species (e.g., Romaña et al.) [55], other sources of ecological information, such as the distribution of reservoirs (e.g., American marsupials of the family Didelphidae) and vegetation type (e.g., palms like Attalea butyracea (Mutis ex L. f.) Wess. Boer), should be considered and included within the models. Further, validation of results via field investigations to identify present species should be conducted.
According to the potential distribution of each species, we defined potential sympatric areas in the country, as an attempt to identify areas where health authorities have to target and improve the entomological surveillance of triatomines. Figure 5 shows that the Sierra Nevada of Santa Marta, the Caribbean coast, Catatumbo, and the southern region of Magdalena valley are areas at the confluence of the four species studied (red color).
The lack of species towards the east of the country could be due to the absence or scarcity of records in this area. Even though this zone of Colombia is the least populated area in the country, entomological surveys in these regions could

Conclusion
This study highlights the relationship between environmental factors and P. geniculatus, R. pallescens, R. prolixus, and T. maculata in Colombia and the importance of GIS and modeling tools to improve mapping efforts. These tools should form part of the official prevention and control plans for vector borne diseases. Previously several aspects of the ecoepidemiology of Triatominae species were unknown and this study helps to identify potential geographic regions where these species can thrive. Knowledge of the biology of vectors is central to choose complementary measures to traditional vector control and surveillance strategies. Ultimately, this information contributes to the understanding of the dynamics of transmission of T. cruzi at local, regional, and national levels.

Disclosure
The abstract of this work was presented in "XVI Congreso Colombiano de Parasitología y Medicina Tropical."