Comparative Analysis on Applicability of Satellite and Meteorological Data for Prediction of Malaria in Endemic Area in Bangladesh

Relationships between yearly malaria incidence and (1) climate data from weather stations and (2) satellite-based vegetation health (VH) indices were investigated for prediction of malaria vector activity in Bangladesh. Correlation analysis was performed between the percent of malaria cases and Advanced Very High Resolution Radiometer (AVHRR)-based VH indices, represented by the vegetation condition index (VCI, which estimates moisture condition) and the temperature condition index (TCI, which estimates thermal condition), and between the percent of malaria cases and rainfall, relative humidity, and temperature from ground-based meteorological stations. Results show that climate data from weather stations are poorly correlated with malaria cases and are not applicable for estimating prevalence in Bangladesh. The study also shows that AVHRR-based VH indices are highly applicable for malaria trend assessment and for estimating the total number of malaria cases in Bangladesh for the period 1992–2001.


Introduction
Malaria has long been a known cause of febrile illness in Bangladesh. Nearly 200,000 malaria cases are reported each year in Bangladesh, which has a population of 140 million. This number can fluctuate depending on weather conditions [1][2][3]. Malaria transmission in Bangladesh is mostly seasonal and limited to the border regions with Myanmar in the east and India in the north (Figure 1). Out of the country's 6 administrative divisions (containing 64 districts), Dhaka, Sylhet, and Chittagong (12 districts) are malaria endemic [4][5][6]. These 3 divisions contribute nearly 98% of the total malaria morbidity and mortality reported in Bangladesh each year [7,8]. Around 27 million people (20% of the total Bangladesh population) live in malaria-endemic areas [9,10].

Malaria and Climate
Three principal environmental factors govern mosquito activity and malaria transmission: temperature, humidity, and rainfall [11,12]. Temperatures within the range of 20°C-30°C affect malaria transmission in several ways: (a) the development of Anopheles is shortened, (b) the biting capacity of mosquitoes is increased, and (c) mosquitoes survive long enough to acquire and transmit the parasite. Temperatures lower than 16°C or higher than 30°C have a negative impact on the growth of the mosquitoes [13]. Mosquitoes breed in water habitats and therefore require the right amount of precipitation for breeding to occur. The effect of rainfall on the transmission of malaria is complicated, varying with the circumstances of particular geographic regions and depending on the local habits of mosquitoes [14]. Anopheles dirus (AD) females stay active during the period when precipitation exceeds 50 mm per month. However, a combination of heavy rainfall and hot weather during June-August might reduce mosquito activity. Rainfall also affects malaria transmission because it increases relative humidity and modifies temperature, and it affects where and how much mosquito breeding can take place. Plasmodium parasites are not affected by relative humidity, but the activity and survival of Anopheline mosquitoes are. High relative humidity allows the parasite to complete the necessary life cycle, so that the infection can be transmitted to several persons [15]. If the average monthly relative humidity is below 60%, it is believed that the life of the mosquito is so shortened that there is no malaria transmission [16]. Monthly temperature and humidity are stable from year to year (variations are 1°C and 1%, respectively), but precipitation has substantial interannual variability.

Data and Methods
Three data types were used in this study: malaria epidemiological statistics, satellite data, and meteorological data from ground stations.
Malaria statistics for 1992 to 2001 were collected from the Directorate General of Health of Bangladesh's Ministry of Health. Malaria data were represented by the annual number of people with fever tested in local hospitals. Standard blood slide examination was done by a medical officer trained in malaria microscopy, according to the WHO guidelines, and slide readers were blinded to the clinical diagnoses. Ten percent each of the positive and negative slides were reexamined blindly by the National Malaria Reference Laboratory to evaluate the accuracy of our study's slide examination results. The diagnostic criteria were fever or history of fever within the last 48 hours, an absence of signs of other disease, and inadequate (or no) antimalarial treatment during the 4 weeks prior to the present illness [4]. The hospital data were aggregated by local administrative unit health centers and further by administrative districts [16,19]. Available malaria statistics are the number of patients whose blood sample was tested for malaria (patient total, PT) and the number of positive malaria cases (PMC). Finally, the number of malaria cases was expressed in percent as (PMC/PT) × 100. The dynamics of annual PT and PMC for Bangladesh during the investigated period are shown in Figure 2.
Satellite data were collected from the Global Vegetation Index (GVI) data set [20]. The GVI is produced by sampling and mapping the 4-km daily radiance in the visible (Ch1, 0.58-0.68 μm), near-infrared (Ch2, 0.72-1.1 μm), and thermal bands (Ch4, 10.3-11.3 μm and Ch5, 11.3-12.3 μm), measured on board NOAA polar-orbiting satellites, to a 16-km map. These maps, including the Normalized Difference Vegetation Index, NDVI = (Ch2 − Ch1)/(Ch2 + Ch1), solar zenith angle, and satellite scan angle, are composited over a 7-day period.
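As a minimal sketch of the two ratios defined above, the percent of positive cases and the NDVI could be computed as follows. The function names and the numeric inputs are hypothetical and serve only as illustration; they are not taken from the study data.

```python
def percent_positive(pmc: int, pt: int) -> float:
    """Percent of tested patients who were malaria positive: (PMC / PT) * 100."""
    if pt <= 0:
        raise ValueError("PT (number of patients tested) must be positive")
    return 100.0 * pmc / pt


def ndvi(ch1: float, ch2: float) -> float:
    """NDVI = (Ch2 - Ch1) / (Ch2 + Ch1), with Ch1 visible and Ch2 near infrared."""
    total = ch1 + ch2
    if total == 0:
        raise ValueError("Ch1 + Ch2 must be nonzero")
    return (ch2 - ch1) / total


# Hypothetical inputs, for illustration only.
example_percent = percent_positive(pmc=250, pt=1000)
example_ndvi = ndvi(ch1=0.25, ch2=0.75)
```

With these made-up inputs, one quarter of the tested patients are positive and the NDVI is 0.5.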
The vegetation health (VH) indices were developed from the Normalized Difference Vegetation Index (NDVI) and brightness temperature (BT). The data processing included removal of high-frequency noise from the annual time series of NDVI and BT, approximation of the annual cycle, calculation of multiyear climatology, and derivation of VH [21].
Journal of Tropical Medicine
High-frequency temporal noise in NDVI and BT, related to fluctuating transmission of the atmosphere, sun/sensor geometry, bidirectional reflectance, random noise, and other factors, was removed by statistical smoothing of the NDVI and BT annual time series for each pixel over the entire period, using a combination of a median filter and the least squares technique. The climatology of the NDVI and BT seasonal cycle was approximated by multiyear maximum (MAX) and minimum (MIN) weekly values taken from the smoothed data. The MAX and MIN for each pixel and week were calculated from twenty years of historical GVI data [20]. The (MAX-MIN) criterion was used to describe and classify the weather-related "carrying capacity" of the ecosystem, and therefore it represented the climatology of extreme weather-related fluctuations in NDVI and BT. The NDVI- and BT-derived weather component was expressed as the Vegetation and Temperature Condition Indices (VCI/TCI) [21]. Equations (1) and (2) show the numerical approximation of VCI and TCI:

$$\mathrm{VCI} = 100 \times \frac{\mathrm{NDVI} - \mathrm{NDVI}_{\min}}{\mathrm{NDVI}_{\max} - \mathrm{NDVI}_{\min}}, \qquad (1)$$

$$\mathrm{TCI} = 100 \times \frac{\mathrm{BT}_{\max} - \mathrm{BT}}{\mathrm{BT}_{\max} - \mathrm{BT}_{\min}}, \qquad (2)$$

where NDVI, NDVImax, and NDVImin (BT, BTmax, and BTmin) are the smoothed weekly NDVI (BT) and their multiyear absolute maximum and minimum, respectively. The VCI and TCI change from 0 to 100, reflecting changes in moisture and thermal conditions from extremely unfavorable (vegetation stress) to optimal (favorable). VCI and TCI values around 50 indicate near-normal conditions, and values below 50 indicate vegetation stress, which is most intense when the index equals 0 [10].
Bangladesh has 34 meteorological stations, which form a sparse network. Daily temperature, rainfall, and humidity data for 1992 to 1999 were collected from the weather stations located in the 3 malaria-endemic divisions. Ten-day average temperature (T, °C) and humidity (H, %) and 10-day total rainfall (R, mm) were generated from the collected data. Regional averages of T, H, and R were calculated as the averages of the values from the weather stations.
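The derivation of the condition indices from the weekly climatology can be illustrated with a short sketch. It assumes the widely used form of the indices, VCI = 100 (NDVI − NDVImin)/(NDVImax − NDVImin) and TCI = 100 (BTmax − BT)/(BTmax − BTmin), and it takes already smoothed weekly values as input; the smoothing step itself is not shown. All function names and numbers are hypothetical.

```python
def weekly_climatology(years_of_weekly_values):
    """Return per-week (mins, maxs) from several years of smoothed weekly values.

    years_of_weekly_values: one equally long list of weekly values per year.
    """
    by_week = list(zip(*years_of_weekly_values))
    mins = [min(week) for week in by_week]
    maxs = [max(week) for week in by_week]
    return mins, maxs


def vci(ndvi, ndvi_min, ndvi_max):
    """Vegetation Condition Index: 0 = extreme stress, 100 = optimal moisture."""
    if ndvi_max <= ndvi_min:
        raise ValueError("ndvi_max must exceed ndvi_min")
    return 100.0 * (ndvi - ndvi_min) / (ndvi_max - ndvi_min)


def tci(bt, bt_min, bt_max):
    """Temperature Condition Index: 0 = hottest on record, 100 = coolest."""
    if bt_max <= bt_min:
        raise ValueError("bt_max must exceed bt_min")
    return 100.0 * (bt_max - bt) / (bt_max - bt_min)


# Hypothetical smoothed NDVI for three years and two weeks of one pixel.
ndvi_years = [[0.2, 0.4], [0.3, 0.6], [0.1, 0.5]]
ndvi_mins, ndvi_maxs = weekly_climatology(ndvi_years)
vci_week0_year0 = vci(ndvi_years[0][0], ndvi_mins[0], ndvi_maxs[0])
```

Note that the TCI is inverted relative to the VCI: a brightness temperature at the multiyear maximum gives TCI = 0 (hot), which matches the interpretation of low TCI as thermal stress.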
Meteorological parameters (T, R, and H) were expressed as deviations from their mean values during 1992-1999 (percent of mean for R, difference from mean for T and H) in order to evaluate weather anomalies during the annual cycle.
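The two anomaly definitions can be written compactly as below. This is only a sketch: it assumes the mean is taken over the same list of values passed in (for example, one 10-day period across the years 1992-1999), and the sample numbers are invented.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def difference_from_mean(values):
    """Anomaly as value minus mean (used for temperature T and humidity H)."""
    m = mean(values)
    return [v - m for v in values]


def percent_of_mean(values):
    """Anomaly as percent of the mean (used for rainfall R)."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean rainfall is zero; percent of mean is undefined")
    return [100.0 * v / m for v in values]


# Hypothetical values for one 10-day period in three different years.
dt = difference_from_mean([25.0, 27.0, 26.0])   # temperature anomalies, deg C
dr = percent_of_mean([50.0, 100.0, 150.0])      # rainfall anomalies, % of mean
```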

Results and Discussion
The long-term tendency in malaria case dynamics shown in Figure 3 was approximated by the linear equation (3), and weather-related variations around the trend were expressed in percent as a deviation from the trend line (4) [10,22]:

$$Y_{\mathrm{trend}} = a_0 + a_1 i, \qquad (3)$$

where Y_i is the percent of malaria cases in year number i; Y_trend is the long-term trend in a region during 1992-2001; a_0 is the intercept; a_1 is the slope; and DY_i is the deviation from the trend (%) in year i. Figure 4 shows the dynamics of the Pearson correlation coefficients (PCCs) between the end-of-year DY and the weekly VCI and TCI during 1992-2001. Analysis of the PCCs in Figure 4 indicates that there are two types of dynamics in the investigated areas. Dhaka and Sylhet divisions have erratic correlation dynamics, whereas Chittagong division has well-pronounced dynamics, corresponding to the main features of the mosquitoes' response to weather and, correspondingly, their ability to spread malaria.
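The detrending and correlation steps described above can be sketched in a few lines. Two points are assumptions rather than statements of the study's method: the trend is fitted by ordinary least squares on the year number, and the deviation is taken as the ratio of the observed value to the trend value, in percent (so 100 means "on the trend"). The function names and sample data are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit y = a0 + a1 * x; returns (a0, a1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a1 = sxy / sxx
    return my - a1 * mx, a1


def deviation_from_trend(y):
    """DY_i in percent of the linear trend (assumed form: 100 * Y_i / Y_trend)."""
    years = list(range(1, len(y) + 1))
    a0, a1 = fit_line(years, y)
    return [100.0 * yi / (a0 + a1 * i) for i, yi in zip(years, y)]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def weekly_correlations(dy, index_by_year):
    """Correlate yearly DY with the index value of each week across years.

    index_by_year: one list of weekly VCI or TCI values per year.
    """
    return [pearson(list(week), dy) for week in zip(*index_by_year)]


# Hypothetical data: three years, two weeks of VCI per year.
dy_example = [90.0, 110.0, 100.0]
vci_by_year = [[50.0, 40.0], [30.0, 60.0], [40.0, 50.0]]
pcc_by_week = weekly_correlations(dy_example, vci_by_year)
```

Plotting `pcc_by_week` against the week number would give a curve of the kind shown in Figure 4.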
Following Figure 4, during the cool season (November through March), when the number of malaria cases is smaller, the correlation of DY with VCI and TCI is low, indicating that the VH indices have low predictive ability. From April, when the warm season starts and mosquito activity intensifies, the correlation increases rapidly, reaching a maximum of −0.50 for VCI and 0.60 for TCI during June-July (weeks 24-28). After these maxima, the correlation gradually decreases to a near-zero level by the beginning of the next cool season in November (after week 40). A negative correlation of DY with VCI indicates that more malaria cases (DY above the trend) develop under drier conditions (VCI < 50, or reduced vegetation greenness; see (1)). Conversely, fewer malaria cases (DY below the trend) are recorded under moist conditions (VCI > 50, or greater vegetation greenness; see (1)). This confirms that, in a climate that is wet on average, excessive rainfall during the monsoon season negatively affects mosquito activity and the ability of mosquitoes to transmit malaria. Regarding thermal conditions, a larger number of malaria cases (DY above the trend) is associated with TCI greater than 50, which indicates cooler weather (see (2)). A smaller number of malaria cases (DY below the trend) is associated with lower TCI (below 50, hotter weather; see (2)) [10]. Therefore, the investigation of VCI and TCI as predictors was performed for all three malaria divisions.
Correlations between DY (percent of malaria data) and the deviation from mean temperature (DT), the deviation from mean rainfall (DR), and the deviation from mean relative humidity (DH) are shown in Figure 5. Figure 5 shows that all the selected parameters (temperature, rainfall, and relative humidity) obtained from the network of ground-based meteorological stations are poorly correlated with the trend of actual malaria cases, and none of these parameters can be a proxy for the malaria trend for the period 1992-1999.
The majority of annual malaria cases occur in summer. Therefore, correlating annual malaria cases with VH and meteorological data pursued four goals, namely to check (1) whether the timing (summer) of the highest correlation is right, (2) whether the correlation for this period is strong, (3) whether the correlation before and after summer is not strong, and (4) whether the transition from spring to summer and from summer to fall is gradual. Following Figure 4 (correlation with VH), these four goals were met because the VH indices assess cumulative conditions (have some "memory"). Following Figure 5 (correlation with 10-day meteorological indices), none of the four goals was met because R, T, and H do not reflect cumulative conditions (do not have "memory").
The results of the correlation analysis in Figure 4 were used to develop regression equations. Several options were investigated using either TCI (thermal condition) or VCI (moisture condition) only, or both indices, for the weeks of the highest correlation [10,19]. The general form of the regression equation when both indices were used is

$$DP = a_0 + a_1 \mathrm{VCI}_i + a_2 \mathrm{TCI}_i,$$

where a_0, a_1, and a_2 are coefficients, i is the week number, and DP is the predicted deviation of the number of malaria cases (%) from the trend. The tested variables are presented in Table 1 with the corresponding multiple correlation coefficients (MCCs), root mean square errors (RMSEs), and F criteria. Analysis indicates that for the two minor divisions (Dhaka, Sylhet), the MCC is not much different from that for individual weeks, but the RMSEs are quite large (30%-35%). Such high errors could be expected, since the minor divisions are very remote and their population is not large and is spread over diverse ecosystems and environmental conditions. In spite of the large RMSEs, several models were selected for further analysis. For Dhaka division, models 2 and 3 showed slightly higher MCC and lower RMSE than the others. Model 3 has some advantage in terms of early indication (week 28) of a possible malaria epidemic. For the Sylhet division, model 3 was selected as giving the best estimates.
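A two-predictor regression of the form DP = a0 + a1·VCI + a2·TCI can be fitted with ordinary least squares, as in the hedged sketch below. It solves the normal equations with a small Gaussian elimination so that it needs only the standard library; the helper names and the synthetic data are assumptions for illustration and do not reproduce the models in Table 1.

```python
import math


def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_two_predictor_model(vci, tci, dy):
    """Least squares estimate of (a0, a1, a2) in DP = a0 + a1*VCI + a2*TCI."""
    rows = [[1.0, v, t] for v, t in zip(vci, tci)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, dy)) for i in range(3)]
    return solve_linear_system(xtx, xty)


def predict(coeffs, vci_value, tci_value):
    """Predicted deviation from trend for one pair of index values."""
    a0, a1, a2 = coeffs
    return a0 + a1 * vci_value + a2 * tci_value


def rmse(predicted, observed):
    """Root mean square error between two equally long sequences."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


# Synthetic data generated from known coefficients, for illustration only.
vci_obs = [30.0, 45.0, 60.0, 50.0, 70.0]
tci_obs = [40.0, 55.0, 35.0, 65.0, 50.0]
dy_obs = [100.0 - 0.5 * v + 0.8 * t for v, t in zip(vci_obs, tci_obs)]
coeffs = fit_two_predictor_model(vci_obs, tci_obs, dy_obs)
fit_error = rmse([predict(coeffs, v, t) for v, t in zip(vci_obs, tci_obs)], dy_obs)
```

With only about ten years of data, as in the study, such a fit has very few degrees of freedom, which is one reason the RMSE and F criteria matter when comparing candidate models.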
The smallest RMSEs were obtained for the main malaria division of the country, Chittagong. For Chittagong division, the best models based on the MCC, RMSE, and F parameters were models 3 and 4. Model 4 (TCI 26 and TCI 30 predictors) provides a slightly larger MCC and a smaller RMSE. In terms of timeliness of prediction, however, model 3 provides earlier warning. The final equations of the best accepted models for the three malaria-prone divisions are shown in Table 2.
In addition to the analysis of MCC, RMSE, and F, we also applied a t-test to the regression coefficients at the 5% and 10% significance levels. Following Table 1, for Chittagong division, model 4 was selected as the best, since the t-test value of 1.58 for the predictor TCI 26 was higher than the critical value (1.38) at the 10% significance level. It is interesting to note that when TCI and VCI for the same week 26 were selected as predictors in model 3, the performance of statistically significant predictor

Model Validations
Further analysis included independent validation of the models (Table 2). Since the training data set is short, the jackknife technique was used as the validation tool. For each model, one year of malaria and satellite data was excluded from the 1992-2001 dataset. A model Q = f(VCI, TCI), where f is an arbitrary function, was developed with one year left out, and this model was applied to the removed year to predict the deviation of the number of malaria cases from the trend (Q) based on the satellite data of the eliminated year. Then, the eliminated year was returned to the data set and the next year was removed for model development and testing [23]. Each year was removed in turn, and the candidate model was fitted nine times and applied to the eliminated year. As a result of this procedure, nine independent predictions were obtained. Finally, for each prediction, the number of malaria cases (P) for the eliminated year (i) was estimated from equation (5). In addition, the coefficient of determination (R²) for the independently predicted and observed numbers of malaria cases, the bias (B), the percentage of relative bias (RB, percent), and the root mean square error (RMSE) were estimated using (6), (7), and (9), where P is the predicted malaria cases (%); B is the average bias for all years; and RMSE is a measure of the precision of the predicted value and should be as small as possible for an unbiased, precise prediction. The models tested independently were models 3 and 4 for Chittagong and model 3 for Dhaka and Sylhet divisions (Table 3). In estimating model performance, we followed these rules: (1) for the entire model, R² greater than 70% and RMSE less than 15%; (2) for individual years, bias (B) less than 2% and relative bias (RB) less than 10%.
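The leave-one-year-out procedure can be sketched as follows. For brevity the sketch uses a single-predictor linear model in place of the arbitrary function f, and it assumes the common definitions B = P − O, RB = 100·B/O, and RMSE = sqrt(mean(B²)) for predicted (P) and observed (O) values; these definitions, the function names, and the data are assumptions made for illustration.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit y = a0 + a1 * x; returns (a0, a1)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a1 = sxy / sxx
    return my - a1 * mx, a1


def jackknife_predictions(x, y):
    """Predict each year from a model fitted to all the other years."""
    predictions = []
    for k in range(len(y)):
        x_train = x[:k] + x[k + 1:]   # drop year k from the predictor
        y_train = y[:k] + y[k + 1:]   # drop year k from the response
        a0, a1 = fit_line(x_train, y_train)
        predictions.append(a0 + a1 * x[k])
    return predictions


def validation_stats(predicted, observed):
    """Return per-year bias, per-year relative bias (%), and overall RMSE."""
    biases = [p - o for p, o in zip(predicted, observed)]
    relative = [100.0 * b / o for b, o in zip(biases, observed)]
    rmse = math.sqrt(sum(b * b for b in biases) / len(biases))
    return biases, relative, rmse


# Synthetic, exactly linear data: every held-out year is predicted perfectly.
tci_values = [10.0, 20.0, 30.0, 40.0, 50.0]
observed = [2.0 * t + 5.0 for t in tci_values]
predicted = jackknife_predictions(tci_values, observed)
biases, relative_biases, overall_rmse = validation_stats(predicted, observed)
```

Because each prediction uses a model that never saw the held-out year, the resulting bias and RMSE are less optimistic than the fit statistics of a model trained on all years.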
The independently validated results presented in Table 3 and Figure 6 show that, for the entire models, the R² and RMSE criteria have been met for Chittagong (model 3) and Dhaka. However, the analysis of the annual performance of the models showed that only the Chittagong models perform reliably. In 8 years for Chittagong division, the bias was less than 2% and RB was less than 10%. Regarding the models for the minor malaria regions, the annual results of testing are negative since, as seen in Table 3, RB indicates a strong deterioration of model performance after 1996 (Sylhet RB was 8%-16% and Dhaka RB was 19%-33%), while prior to 1996 RB was 50% lower. A possible explanation is that the Bangladesh government developed very comprehensive measures to combat malaria in Sylhet and Dhaka. During 1996-2001, this resulted in a considerable reduction of malaria cases (as shown in Figure 3). However, the data must be interpreted with caution because of the decline in surveillance activities in the country over the past few years (WHO 1999).
Figure 6: Correlation between independently predicted and observed number of malaria cases (% of malaria cases from the total number of people who come to the regional hospitals with fever).
The Bangladesh government has also undertaken malaria-control measures in 13 districts of the Chittagong, Sylhet, and Dhaka divisions. However, in Chittagong the effectiveness was not as good as in the minor malaria regions because a large number of people were exposed to malaria, especially among the poor in Chittagong division (93% of the entire Bangladesh people were affected). The lower effectiveness of malaria-control measures is also indicated by an increase in the number of malaria cases during 1992-2001 (as shown in Figure 3).
It is important to emphasize that although the Chittagong models met both performance criteria (for the entire model and for individual years), in some years B and RB exceeded the threshold level. In 1997 (RB = 11%), the models for Chittagong overestimated the number of malaria-affected people. First, we should emphasize that 1997 was a strong El Niño year (positive sea surface temperature anomaly in the tropical Pacific) [24,25]. As a result, the southeastern monsoon in Bangladesh was delayed by one month, which resulted in a long period of extremely hot and dry weather. Figure 7 shows the TCI and VCI dynamics in Chittagong. During the investigated years, severe drought (TCI = 15-30) developed from June through October. Such conditions produced a much stronger impact on mosquito development, disrupted the reproductive cycle of mosquitoes, and intensified malaria transmission, which resulted in a smaller number of malaria cases compared to the prediction. To characterize such extreme conditions, the model predictors should cover a longer period (several weeks or even months), which is not possible now due to the limited statistical sample.

Conclusions
This study shows that parameters such as rainfall, relative humidity, and temperature obtained from the network of ground meteorological stations cannot be used as proxies for the malaria trend in the three divisions of Bangladesh. It also shows that satellite data, in particular the AVHRR-based VH indices (VCI and TCI), are highly applicable as proxies for malaria trend assessment and for estimating the total number of malaria cases in the divisions of Bangladesh for the period from 1992 to 2001.