Technical Workflow Development for Integrating Drone Surveys and Entomological Sampling to Characterise Aquatic Larval Habitats of Anopheles funestus in Agricultural Landscapes in Côte d'Ivoire

Land-use practices such as agriculture can impact mosquito vector breeding ecology, resulting in changes in disease transmission. The typical breeding habitats of Africa's second most important malaria vector Anopheles funestus are large, semipermanent water bodies, which make them potential candidates for targeted larval source management. This is a technical workflow for the integration of drone surveys and mosquito larval sampling, designed for a case study aiming to characterise An. funestus breeding sites near two villages in an agricultural setting in Côte d'Ivoire. Using satellite remote sensing data, we developed an environmentally and spatially representative sampling frame and conducted paired mosquito larvae and drone mapping surveys from June to August 2021. To categorise the drone imagery, we also developed a land cover classification scheme with classes relative to An. funestus breeding ecology. We sampled 189 potential breeding habitats, of which 119 (63%) were positive for the Anopheles genus and nine (4.8%) were positive for An. funestus. We mapped 30.42 km2 of the region of interest including all water bodies which were sampled for larvae. These data can be used to inform targeted vector control efforts, although its generalisability over a large region is limited by the fine-scale nature of this study area. This paper develops protocols for integrating drone surveys and statistically rigorous entomological sampling, which can be adjusted to collect data on vector breeding habitats in other ecological contexts. Further research using data collected in this study can enable the development of deep-learning algorithms for identifying An. funestus breeding habitats across rural agricultural landscapes in Côte d'Ivoire and the analysis of risk factors for these sites.

large region is limited by the fine-scale nature of this study area. is paper develops protocols for integrating drone surveys and statistically rigorous entomological sampling, which can be adjusted to collect data on vector breeding habitats in other ecological contexts. Further research using data collected in this study can enable the development of deep-learning algorithms for identifying An. funestus breeding habitats across rural agricultural landscapes in Côte d'Ivoire and the analysis of risk factors for these sites.

Introduction
Land-use and land cover changes transform landscapes, with complex effects on mosquito vectors and the transmission of malaria and other vector-borne diseases [1,2]. ese environmental changes can lead to shifts in vector densities, species compositions, host availability, and biting patterns [3]. Despite large reductions in malaria burden globally, these landscape changes threaten to undermine malaria control and elimination efforts and have been identified as a priority for malaria eradication by the World Health Organisation [4]. Land-use practices, such as agriculture, forestry, and urbanisation, can modify physical surfaces and create new malaria vector breeding sites. Identifying where and when land-use practices are likely to increase mosquito vector densities and malaria transmission is a fundamental step towards understanding how landscapes can be sustainably managed to improve public health.
Remote sensing data is becoming increasingly accessible as a means to investigate the impact of land-use practices and other landscape characteristics on different aspects of vector ecology [5]. ese Earth Observation (EO) data have different spatial and temporal resolutions depending on the sensor specifications and frequencies of obtaining usable imagery. In addition, optical EO data are characterised by spectral resolution determined by the number and wavelengths the sensor measures along the electromagnetic spectrum. Traditionally, pixel-based machine learning methods have been used to identify land cover categories of interest from these spectral signatures. However, new sources of Earth Observation data, such as very high-resolution satellite imagery (e.g., Geo-Eye and Planet SkySat) and drone imagery, often have high spatial resolution but much lower spectral resolution. Aerial imagery may also be used to estimate 3-dimensional structures using photogrammetric (structure from motion) methods [6]. While these new data sources have enormous potential for identifying vector habitats, the characteristics of these data limit the utility of pixel-based methods relying solely on spectral characteristics to classify environments. Deep-learning methods, such as convolutional neural networks (CNNs), are revolutionising image analysis. ese methods allow efficient analysis of image textures and spectral characteristics using self-learning artificial intelligence approaches to identify features in complex environments. By enabling the identification of highly discriminative features and patterns, these methods have the potential to detect landscape factors associated with vector habitat productivity, helping to overcome the limitations of lower spectral resolution data. Although computationally intensive, more efficient deep-learning architectures and cloud-based computing are increasingly accessible and have been applied for ecological analysis (e.g., [7][8][9]).
Mosquitoes have four main life stages. e first three juvenile stages, egg, larva, and pupa, are aquatic, with the adult stage being terrestrial. When aiming to identify aquatic mosquito habitats, the choice of EO data and analysis methods is largely dependent on the local vector ecology and available resources. Anopheles funestus, the second most important malaria vector in Africa, typically breeds in large permanent or semipermanent water bodies with emergent vegetation [10][11][12][13]. e creation of these habitats has been associated with agricultural practices such as rice cultivation [14][15][16], pastures [17], cultivated swamps [18][19][20], and canals and drainage ditches used for irrigation [10-12, 21, 22]. e relatively large and stable qualities of An. funestus habitats suggest the utility of satellite-based EO data to identify breeding sites despite coarser spatial resolution and less frequent coverage of satellites over target areas. e identification of breeding sites can allow for targeted malaria control using larval source management, or LSM. e goal of LSM is to reduce adult mosquito populations by targeting the immature aquatic stages of disease vectors through environmental, biological, or chemical modification of the larval habitats [23]. Using LSM in areas where this vector dominates has been suggested as a potential tool to complement existing malaria control efforts due to the "few, fixed, and findable" nature of their breeding sites [13]. As satellite-based EO data are often available freely through government sources (e.g., NASA and the European Space Agency), these data sources may be more attractive for control programmes operating in resource-limited environments. However, although satellite-based EO data have been used to characterise risk factors for An. funestus breeding sites (e.g., [24,25]) and describe An. funestus species distributions (e.g., [26]), predictive models integrating EO data have not been routinely applied to identify An. funestus breeding sites as targets for vector control. A previous study found the pixel-based analysis of spectral signatures of EO data were insufficient to identify An. funestus breeding sites [27]. Subsequent studies have visually identified these breeding habitats from very high-resolution aerial imagery collected by drones [28]. In comparison to satellite-based EO data, aerial drone surveys are more resource intensive and require field-based teams, but they may generate data that are more easily interpretable and actionable for local control programmes.
Despite the medical importance of this vector, and the nature of breeding water bodies which intuitively should be relatively easy to detect, An. funestus breeding habitats are difficult to locate [13]; however, there is a paucity of knowledge on their breeding ecology in Côte d'Ivoire. Developing operationally feasible methods of identifying potential breeding sites for An. funestus requires integrating different sources of EO data with local entomological knowledge. Baseline knowledge of the spatial distribution and characteristics of their larval habitats are required and can be acquired by conducting larval surveys during the appropriate months. An important requirement in the collection of larval distribution data is to ensure that all habitats present in the study area are sampled equally. Ensuring such spatial representativeness can help to build an understanding of the distribution of breeding sites across the full range of land types and habitats present in the study site, rather than an overrepresentation of habitats which are easy to access or are suspected to be associated with larval presence [29,30].
One method of ensuring environmental and spatial representativeness in a larval sampling frame is by classifying the region of interest into environmental strata such as land cover and randomly sampling each stratum proportionally.
is ensures a random spatial distribution of sampled sites, with an even number of sampled sites for each environmental stratum. EO data such as drone imagery and multispectral satellite data have been used to develop stratified sampling designs for malaria vector larval surveys [29,31]. In the case of An. funestus breeding habitats and their documented rarity [13], a representative sampling design can be integrated with local entomological and hydrological knowledge.
is should ensure that the aquatic habitats which could potentially serve as sites for An. funestus oviposition within the study site are located. e locations of potential An. funestus breeding sites identified through the representative sampling frame can then be used to guide a second stage of intensive larval surveys. Such adaptive sampling designs, which use information from previously collected data to inform future sampling sites, have been used to aid the field studies in locating rare or hard-to-find phenomena such as plant species or disease hotspots [32][33][34].
While ground-based studies and aerial mapping may be essential for identifying fine-scale characteristics of water bodies and determining vector productivity, regional mapping of large areas can only be achieved by using satellite-based EO data. e utility of these different data sources and image analysis methods are highly dependent on the identification of fine-scale landscape factors influencing vector distributions. Deep learning methods of image classification are highly reliant on the availability of sufficient training data on areas with and without vector breeding sites. is necessitates designing a spatially representative sampling approach to identify breeding sites and key landscape features across areas of interest. e aim of this study was to characterise the aquatic habitats of An. funestus across an agricultural gradient in Côte d'Ivoire using paired drone and entomological surveys. In this paper, we outline a technical workflow to carry out these aims and to collect data to enable development of deep-learning approaches to identify An. funestus breeding sites. e specific objectives were to (i) develop a land cover classification scheme with categories relative to An. funestus breeding and to categorise drone images accordingly; (ii) to develop a statistically rigorous larval sampling frame for An. funestus which is environmentally and spatially representative of the study site using remote sensing data and which is spatially adaptive to account for the rarity of breeding sites; and (iii) to conduct paired larval and high resolution drone surveys and collect independent validation data for the land classification scheme.

Study Area.
is study was conducted in two villages southeast of Bouaké in the Gbêkê region of the Vallée du Bandama district of central Côte d'Ivoire. In this region, malaria is endemic with year-round transmission peaking during the rainy season between May and October [35]; whilst the dominant malaria vector species is An. gambiae sensu lato, An. funestus s.l., An. nili, and An. ziemanni are also present. e average annual rainfall is 1200 mm, and there is an average annual temperature of 25.8°C [14].

Study Design and Site Selection.
is was a cross-sectional study pairing larval surveys of An. funestus with drone surveys to capture the landscape composition that drives the fine-scale spatial distribution of An. funestus breeding sites in two villages in central Côte d'Ivoire. e two villages were chosen based on adult mosquito catch data collected during a randomised controlled trial on the impact of structural interventions on malaria transmission from 2017 to 2019 [35]. Specifically, "control" villages (i.e., without structural interventions) with high mean nightly catches of An. funestus (using human landing catches and CDC light traps) as well as high proportions of An. funestus compared to overall Anopheles numbers were selected ( Figure 1). e region of interest was set as a 3 km circular buffer (12.6 km 2 ) surrounding the centroid of each village ( Figure 2), corresponding to the average mosquito flight range [36].

Defining Land Classes of Interest.
We reviewed the scientific literature for descriptions of land cover types and water body characteristics which are associated with An. funestus larval and adult stages. is informed the development of a hierarchical land classification scheme for the Côte d'Ivoire study site. An initial scoping assessment of freely available high-resolution satellite data (https://earth. google.com/web/and https://www.planet.com/explorer) was performed. is assessment allowed us to determine which of the land classes and water body types identified from the literature review were present in the Côte d'Ivoire study sites and were discernible from high-resolution remote sensing data. e presence of these land classes and water body types were subsequently validated by field staff who were familiar with the landscape. e resulting land classification scheme was used in ground truthing field surveys.

Sampling Frame.
Two stages of sampling were used to locate potential An. funestus larval habitats. e first was a randomly stratified sampling frame with targeted sampling, where the landscape was divided into environmental strata using remote sensing data, and each stratum was sampled proportionally. e second sampling stage was an adaptive sampling strategy to account for the low predicted number of An. funestus breeding sites within the study area.

Randomly Stratified and Targeted Sampling Stage.
A hexagonal grid was used to partition the region of interest into sampling units of 10,000 m 2 . Hexagons were used because their centroids are equidistant to the perimeter of each edge. Sentinel-2 MSI: MultiSpectral Instrument, Level-2A, wide-swath high-resolution multispectral imagery was retrieved from Google Earth Engine (GEE). e imagery was gathered from 1st January 2019 to 31st December 2020, all with less than 20% cloud coverage. e imagery was used to compute the following vegetation indices: normalised difference vegetation index (NDVI), normalised difference water index (NDWI), enhanced vegetation index (EVI), and soil adjusted vegetation index (SAVI). e GEE script with the preprocessing of the Sentinel-2 imagery and computation of the vegetation indices is provided in Supplementary Materials 1. e equations used to compute the vegetation indices are as follows: (1) A k-means clustering algorithm was then applied to group hexagons into four environmental strata based on their multispectral and environmental profiles (Figure 2(a)).
Based on the average time taken to sample a hexagon for a team of three fieldworkers, a sample size of 70 hexagons per village was chosen for this initial sampling stage. e 70 hexagons were randomly selected proportional to the total number of sampling units of each stratum across the region of interest (Figure 2(b)). ey were then assigned a unique identifier code based on the village, stratum, and position within the area of interest. To locate the randomly selected sampling unit within the field and to ensure that sampling was carried out within the boundaries of the hexagon, an offline satellite map overlaid with the hexagons of interest was created using QGIS v3.12 [37] as an MBTile. e maps were uploaded to the OpenDataKit (ODK) software [38] onto Android tablets, and the GPS coordinates of the tablet were visualised on the offline map using the geotrace function. Sampling one hexagon consisted of collecting three ground truthing points and sampling all potential larval habitats for larvae. At each ground truthing point, the GPS coordinates and photos were collected using digital forms installed in tablets through ODK, and the predominant land cover type was recorded according to the land cover classification scheme, which captures agricultural, savannah, and built environments (Figure 3).
Due to the scarceness and documented difficulty in finding An. funestus breeding sites [13], we conducted targeted sampling alongside the stratified sampling by actively searching for water bodies in the vicinity of the randomly stratified hexagons. ese surveys were aided by two members of the village, who are very familiar with the locations of water bodies and their seasonality. e geotrace function on the ODK software was also used to record the GPS coordinates of the hexagons which were visited during the targeted surveys.

Adaptive Sampling Stage.
Using the GPS coordinates of the water bodies recorded during the first stage of the sampling frame, we conducted an intensive second stage of larval sampling.
is involved returning to all previously identified water bodies and sampling for larvae and prospective searches within these areas to find similar water bodies nearby, aided again by the village member's knowledge.

Larval Survey, Environmental DNA Collection, and Water
Body Characteristics. Larval surveys were conducted at every water body found using the two sampling strategies. Each sample included larval dips, an environmental DNA sample, and the collection of water body characteristics.
Larvae were sampled using standard larval dippers (BioQuip 350 mL), with five dips conducted at each sampling spot. For water bodies that had a clear distinction between areas with and without emergent vegetation, two samples were performed accordingly. Counts of Anopheles immatures were recorded according to their developmental stage: early (L1/L2), late instars (L3/L4), and pupae. ey were transferred into QR code-labelled vials for transportation to the entomological laboratory, where they were reared into adults for morphological species identification using a dissecting microscope. e presence or absence of culicine larvae was also recorded. e standard operating procedures for larvae collections and rearing are outlined in Supplementary Materials 2A.
In order to confirm the absence or presence of An. funestus larvae, water samples were also collected at every sampling spot for environmental DNA (eDNA) analysis. ese were collected into QR code-labelled 50 mL Falcon tubes and stored in a freezer for transportation to the entomological laboratory, where they were stored at − 80°C. eDNA was extracted according to the protocol outlined in Supplementary Materials 2B. Taqman quantitative PCR assays were then used to detect eDNA from An. funestus with probes targeting the species-specific sequences of five An. funestus species available in the National Center for Biotechnology Information (NCBI) database [39].
Regardless of Anopheles' absence or presence, the physicochemical parameters of each water body were collected in ODK forms. e parameters of interest and the equipment used to collect them are listed in Table 1. Red-green-blue (RGB) mapping was conducted in the areas surrounding the water bodies found during the larval sampling. Drones were flown at an altitude of 150 m, generating imagery with a pixel size of ∼4 cm. Photographs were taken using a "fast" picture trigger mode with a 70% forward and horizontal overlap to generate seamless image mosaics. Drones were programmed to fly 650 × 600 metres (0.40 km 2 ) parcel sizes from takeoff points identified using grids overlaid on satellite imagery (

Data Management.
To ensure confidentiality, field data (larval and ground-truth surveys) were encrypted, stored on secured tablets, and sent to a private server. Drone imagery was kept in an encrypted hard drive on a password-protected computer. Morphological identification of adult Anopheles mosquitoes was transcribed and stored in an encrypted folder.

Orthomosaic Construction.
Drone imagery was combined into high-resolution image mosaics through photogrammetric processing using AgiSoft Metashape Professional (https://www.agisoft.com). First, drone imagery, which is automatically recorded with GPS coordinates, is imported into MetaShape and processed to construct an orthomosaic (a georeferenced mosaic of overlapped images). Second, photographs are aligned (accuracy: high; generic and reference preselection active; key point limit: 40,000; adaptive camera model fitting active) and points are matched between overlapping images to build a sparse point cloud. ird, the sparse point cloud is used as a foundation to build a dense point cloud (quality: high; and depth filtering: moderate). Fourth, the dense point cloud is used to classify ground points. Fifth, a digital elevation model (DEM) is built (geographic projection using WGS 84 (EPSG: 4326); pixel size of 14.6 cm; interpolation: extrapolated; all point classes to generate a digital surface model). Finally, an orthomosaic is built (input surface: DEM; blending mode: mosaic; pixel size of 3.6).

Classifying Remote-Sensing Imagery.
e drone data were supplemented with high-resolution (0.58 m spatial resolution, RGB spectral resolution) commercial satellite data from DigitalGlobe© WorldView2 (https://discover. digitalglobe.com/), collected in September 2019 and covered the region of Gbêkê. ese remote-sensing data were classified into land cover types relevant to An. funestus breeding site spatial distribution, according to a refined version of the classification scheme that was developed to ensure parsimony and applicability to An. funestus breeding ecology. e drone images were labelled according to the simplified land classification scheme by object detection methods using the free and collaborative online image labelling tool Groundwork (https://groundwork.azavea.com). Each label was validated by local field technicians and researchers from Côte d'Ivoire who are familiar with the landscapes. Labelled data were exported as georeferenced GeoJSON files.

Results
A summary of the technical workflow is presented in Figure 4.

Classification.
Based on a review of published literature, we identified key characteristics of potential An. funestus breeding sites (Table 2). ese included irrigated agricultural landscapes as well as large water bodies with emergent vegetation. We additionally identified key areas in which An. funestus was unlikely to be detected, including roads and other built-up areas. Based on these findings and local ecology, we developed a land classification scheme specific to this area and An. funestus habitats ( Figure 5).

Stratification.
e optimal number of clusters from the K-means algorithm was k � 4, giving us 4 environmental strata which the hexagons were assigned to (Figure 2(b) and Table 3).

Larval Sampling.
e larval and drone surveys were conducted from 24th June to 31st July 2021. e second rainy season in Côte d'Ivoire occurs from July to September. is season, however, was remarkably dry. We sampled 189 potential mosquito breeding habitats, of which 119 (63%) were positive for larvae of the Anopheles genus, found mostly using targeted sampling (Table 4). Anopheles larvae-positive habitats were more likely to be semipermanent or permanent water bodies with emergent vegetation and stationary water flow (Table 5). e anopheline species identified included An. gambiae, An. nili, An. pharoensis, An. ziemanni, and An. funestus. Nine sampling points (4.8%) were positive for An. funestus larvae. Culicine larvae were also recorded in 120 water bodies (63%) but were not counted nor speciated. eDNA samples were collected for every sampled breeding habitat. At the time of writing, the eDNA results have not been finalised.
A total of 671 hexagons were visited during the study, 121 according to the randomly stratified sampling design and the remaining 550 were visited during targeted and adaptive surveys ( Figure 6). is makes up 11.6% of the total hexagons in the full region of interest.

Drone Surveys.
We planned on performing 176 drone flights of 650 m × 600 m to cover the study area of 68.64 km 2 , but due to technical difficulties with the drones we were able to complete 78 flights, covering 30.42 km 2 (Figure 7(a)). e rectangular grid was used as a guideline for the drone flights, although there could be a gap between the grid and the orthomosaic generated by a single drone flight (Figure 7(b)).

Ground-Truthing and Image Classification.
From the available drone imagery and ground-based observations, we developed a workflow to manually classify different land cover categories of interest. Researchers used online tools to manually digitise the extent of different land cover categories ( Figure 8).

Discussion
Within this study, we developed a systematic approach to collect and label paired entomological and aerial EO data for the development of deep-learning approaches. While drones are increasingly accessible to vector control programmes, critical gaps remain on how to design data collection and analysis methods. is study outlines a technical workflow and procedures for collecting these types of data, providing a template which can be easily adapted to collect data on vector breeding sites in other ecological contexts. is allows the identification of associations between landscape characteristics and vector breeding sites while also developing training data for deep-learning algorithms.
A key finding of this project was that the development of an environmentally and spatially representative sampling design from freely available satellite data is a feasible method for initial surveys of a landscape for a rare phenomenon   with aquatic vegetation and algae [28] Tanzania Yes Ephemeral river channels [11] Kenya Highlands Streams, drainage ditch habitats [27] Kenya -Shaded water [12] Ethiopia Irrigated landscape Puddles of lowland dam, and irrigation canals 8 Journal of Environmental and Public Health when supplemented with local knowledge of the environment and a follow-up stage of intensive sampling. Larval surveys are a common form of entomological sampling; however, their sampling frames can often be designed with convenience or prior knowledge in mind. For example, a habitat type which is known to be associated with a certain vector may be oversampled, leading to biases in the representation of land types present. In the case of the breeding ecology of An. funestus, for which there is a paucity of knowledge, it is especially important to sample the full range of habitats present in the study area, to gain knowledge of the areas where aquatic habitats are present but also where they are absent. It is also crucial that a sampling frame will pick up the species of interest and adjust the design to ensure this if necessary. In our study, it is clear, from the low number of potential habitats picked up through the random stratification alone, that this sampling frame required an extension in order to be confident that we could locate rare An. funestus aquatic habitats [13,49]. We extended the sampling frame to account for the rarity of these habitats by searching in the terrain surrounding the randomly selected hexagons, aided by local knowledge of water body distribution. e targeted surveys were a successful method of locating water bodies, leading to 107 potential larval habitats, often in areas  Journal of Environmental and Public Health which would not have been visited based on convenience sampling. In the case of An. funestus breeding habitats in Côte d'Ivoire, we found that this sampling framework served as a strong initial survey which allowed us to locate potential breeding sites. When followed up with an adaptive sampling stage of intensive larval surveys, we were successful in identifying a total of nine An. funestus breeding sites.
In this study, we stratified the landscape using Sentinel-2 multispectral bands and vegetation indices which were derived from them. Other sources of freely available satellite remote sensing and spatially explicit data, such as soil type, hydrological indices, and weather reanalysis and forecasting, could have been used for the same stratification purposes and could have produced different numbers and locations of strata. Nonetheless, to validate the usage of Sentinel-2-derived variables to differentiate the landscapes of interest for An. funestus breeding, we collected ground-truthing land cover data and performed a sensitivity analysis. Future studies could build on existing data to iteratively develop sampling frames including all key metrics associated with the vector of interest. e nine An. funestus larval habitats found were large, semipermanent water bodies, which made up a small subset of our total sampled habitats. ese findings support previous descriptions of An. funestus habitats as "few, fixed, and findable" [13,49]. e cross-sectional nature of our study, however, did not allow us to confirm whether these water bodies are breeding sites consistently over time, where female An. funestus returns every breeding season. To investigate this, the same sampling points could be revisited and sampled for An. funestus larval presence in the following rainy season. e results of our larval survey could be used in future studies to describe the distribution of An. funestus breeding sites within this region.
We trialled a novel protocol for the collection of eDNA samples from potential breeding sites to show the presence of An. funestus in water bodies where their larvae were not collected or identified. e previous detection of eDNA from artificial Anopheles larval habitats has demonstrated the ability to detect 0.002 larvae/ml, and further validation of our protocol with field samples is ongoing [50]. e SOPs provided and instructions on these methods should serve as good guidelines for future researchers.
One of the key outputs of this study is the development of a workflow using open-source tools to collect and label imagery. We found that the use of offline maps loaded to the ODK forms was essential for efficient navigation in the field and the collection of drone data. e inclusion of gridded guidelines for drone surveys and previously sampled water bodies allowed for efficient mapping of the sampled sites and   their surrounding landscapes in real-time. In addition, the online platform Groundwork for categorising our drone imagery provided a user-friendly, collaborative option for the often tedious task of labelling imagery for training datasets for deep-learning methods. e platform enabled good project management with the ability to oversee labellers' progress, and all labelled images were verified by experts who are familiar with the environment. is is an efficient tool for building a catalogue of training data for wider projects using remote-sensing data with the aim of landscape classification using deep-learning methods. Using a landscape classification system which is based on the literature of An. funestus breeding ecology should ensure its applicability for further research into land cover risk factors for the vector's larval habitats and predictive mapping based on land cover. Another important consideration in projects involving drone mapping and entomological sampling is the seasonal changes which may occur during the project. is is especially important in rapidly changing environments, for example, in an area undergoing deforestation events, and during changing seasons, for example, in the transition months between dry and rainy seasons. ese biotic and abiotic changes can have significant impacts on mosquito ecology [51] and can also be visible in remote-sensing imagery. One way to adjust a drone and entomology protocol to account for this is to aim to collect drone and entomology data simultaneously. Our protocol aimed to map a potential larval habitat no more than two days after sampling, to ensure that the images collected reflected the characteristics of the water body at the time of sampling. However, due to logistical issues, there was a lag of over two weeks when mapping some water bodies. e extent of the area to be mapped also resulted in a timespan of six weeks between the first and the last drone flights. In the case of this study, we do not expect these time lags to have a major impact on the imagery analysis, as the rainy season which was due to commence during our data collection period arrived late, and the weather remained relatively stable throughout. We do, however, recommend taking time lags into account when planning drone and entomological surveys.
is project was somewhat limited by technical issues encountered with drones. On two occasions, our drones lost connection with the remote controller while in flight, resulting in irreparable damage. Although there were drone merchants in Cote d'Ivoire, procurement of new drones was slow for the short time span of this project. e original protocol for this study aimed to map a 3 km buffer around the study villages; however, due to technical issues this was  e protocol changed to reflect this, and we focused mapping efforts on sampled water bodies and the landscape immediately surrounding them, rather than mapping the full region of interest. Technical issues are an important factor to take into account when planning a project which involves drone mapping, and we recommend adding contingency time and budget to any fieldwork planning. We strongly suggest always using the equipment according to the manufacturer's instructions and taking special caution to keep equipment from overheating, especially in sunnier environments. It is useful to establish in-country connections with drone vendors prior to beginning fieldwork, and if possible, to have backup equipment should any technical issues arise.

Conclusions
is paper provides a technical workflow on the development of an environmentally and spatially representative sampling design using remote-sensing data and the collection of drone imagery and larval distribution data for deep-learning methods. We hope that it can act as a useful guide for the integration of drone and vector data collection for future research. We found that using a randomly stratified sampling design which is representative of the landscape supplemented by local knowledge, and a second stage of intensive sampling, serves as a strong framework for identifying the "few, fixed, and findable" breeding sites of An. funestus in our study villages.
Data Availability e data that support the findings of this study can be obtained from the corresponding author upon reasonable request. Disclosure e opinions expressed here belong to the authors and do not necessarily reflect those of A4NH or CGIAR.  Figure 8: An example of a drone image categorised into four of the land classes from the refined land classification system. and other staff from AfricaRice for their help with drones and other field equipment. e authors also acknowledge Dr. Joseph Timothy from the London School of Hygiene and Tropical Medicine for providing the methodology behind the offline maps used in the field. e authors acknowledge all the staff involved in the entomological work in the Eave Tube Phase 3 Randomized Control Trial in Ivory Coast (2016-2020). e authors appreciate Maria Bernardez Agrafojo, Anastasia Ioakeimidou, and the Network for the use of drones for malaria vector control (MACONDO) for their technical support. is work was undertaken as part of, and funded by, the CGIAR Research Program on Agriculture for Nutrition and Health (A4NH). is paper has not gone through the standard peer-review procedure of A4NH's Lead Center, IFPRI. Additional funding was provided by MACONDO, "International collaborative network for the integration, standardization and assessment of the use of drones in malaria vector control strategies" funded by the Global Challenges Research Fund (GCRF) Networking Grants (GCRFNGR4\1140). KMF is supported by a Sir Henry Dale Fellowship jointly funded by the Wellcome Trust and the Royal Society (Grant no. 221963/Z/20/Z).