Use of remote sensing to identify spatial risk factors for malaria in a region of declining transmission: a cross-sectional and longitudinal community survey

Background The burden of malaria has decreased dramatically within the past several years in parts of sub-Saharan Africa. Further malaria control will require targeted control strategies based on evidence of risk. The objective of this study was to identify environmental risk factors for malaria transmission using remote sensing technologies to guide malaria control interventions in a region of declining burden of malaria. Methods Satellite images were used to construct a sampling frame for the random selection of households enrolled in prospective longitudinal and cross-sectional surveys of malaria parasitaemia in Southern Province, Zambia. A digital elevation model (DEM) was derived from the Shuttle Radar Topography Mission version 3 DEM and used for landscape characterization, including landforms, elevation, aspect, slope, topographic wetness, topographic position index and hydrological models of stream networks. Results A total of 768 individuals from 128 randomly selected households were enrolled over 21 months, from the end of the rainy season in April 2007 through December 2008. Of the 768 individuals tested, 117 (15.2%) were positive by malaria rapid diagnostic test (RDT). Individuals residing within 3.75 km of a third order stream were at increased risk of malaria. Households at elevations above the baseline elevation for the region were at decreasing risk of having RDT-positive residents. Households where new infections occurred were overlaid on a risk map of RDT positive households and incident infections were more likely to be located in high-risk areas derived from prevalence data. Based on the spatial risk map, targeting households in the top 80th percentile of malaria risk would require malaria control interventions directed to only 24% of the households. 
Conclusions Remote sensing technologies can be used to target malaria control interventions in a region of declining malaria transmission in southern Zambia, enabling a more efficient use of resources for malaria elimination.


Background
The burden of malaria has decreased dramatically in parts of sub-Saharan Africa [1]. While some of this decline is attributable to intensified control efforts, particularly the use of more effective anti-malarial drugs and the scale-up of insecticide-treated nets (ITNs) and indoor residual spraying (IRS), the situation is more complex [2]. In several sites the decline in malaria preceded widespread introduction of control strategies. In Kilifi, Kenya, for example, the decline in paediatric hospitalizations for malaria began a year before the large-scale distribution of ITNs and two years before the availability of artemisinin-combination therapy (ACT) [3].
Declines in the burden of malaria have been observed in Zambia, coincident with the widespread implementation of malaria control strategies, including the use of ACT as the first-line treatment regimen, the distribution of long-lasting insecticidal nets (LLINs), and targeted IRS [4]. The overall prevalence of malaria parasitaemia in children younger than five years of age decreased 53% from a baseline prevalence of 22% between 2006 and 2008 [5]. In Choma District, southern Zambia, paediatric hospitalizations at Macha Hospital for malaria decreased from approximately 1,400 admissions per malaria season in 2000-2001 to fewer than 50 per year in each of the past three years. This decline began in 2004, shortly after the introduction of ACT, but before widespread distribution of LLINs.
The decline in the burden of malaria in parts of sub-Saharan Africa has led to interest in malaria elimination and eradication [6]. One strategy for regional elimination is a step-wise approach starting with countries at the margins of endemic transmission [7], including the southern margins of malaria transmission in Africa [8].
Within the countries of southern Africa there is a range of Plasmodium falciparum parasite rates, corresponding to different levels of endemicity and phases of control [9]. Understanding the local epidemiology of malaria in southern Africa is thus critical to assessing the feasibility of regional malaria elimination.
Targeting interventions to hotspots of malaria transmission results in more efficient and cost-effective accelerated malaria control efforts [10] but requires identification of individual, household, and environmental correlates of transmission. Remote sensing technologies were applied to determine if readily generated environmental data are of sufficient detail and validity to identify heterogeneity in spatial risk factors for malaria transmission in southern Zambia, where the burden of malaria has decreased and elimination may be feasible.

Study population
Satellite images were used to construct a sampling frame for the random selection of households enrolled in prospective longitudinal and cross-sectional surveys of malaria parasitaemia in the catchment area of Macha Hospital in Southern Province, Zambia (Figure 1). Macha Hospital is approximately 70 km from the nearest town of Choma and the catchment area is populated by traditional villagers living in small, scattered homesteads. Anopheles arabiensis is the primary vector responsible for malaria transmission, which peaks during the rainy season from December through April [11]. The sampling frame for the random selection of households was constructed from a Quickbird™ satellite image obtained from DigitalGlobe Services, Inc. (Denver, Colorado). The image was imported into ArcGIS 9.2 (Environmental Systems Research Institute [ESRI], Redlands, California) and locations of households were identified and enumerated manually. Structures of appropriate size and shape were identified as potential residences, and consisted of one or more domestic structures where members of a family resided. Smaller structures, such as kraals, and larger structures, such as schools, were excluded. Selected households were allocated to one of two study cohorts: longitudinal and cross-sectional. Households in the longitudinal cohort were surveyed repeatedly approximately every two months and households in the cross-sectional cohort were surveyed once. A field team was provided with images and coordinates of the randomly selected households. After obtaining permission from the local chief and head of household, and individual written informed consent, a questionnaire was administered to each participant residing within the household and a blood sample was collected by finger prick. Rapid diagnostic tests (RDT; ICT Diagnostics, Cape Town, South Africa) were used to detect P. falciparum histidine-rich protein 2. This RDT was shown to detect 82% of test samples with wild-type P. falciparum at a concentration of 200 parasites/μL and 98% of test samples with a concentration of 2,000 parasites/μL, with false positives in 0.6% of clean negative samples [12]. Individuals who were RDT positive were offered treatment with artemether-lumefantrine (Coartem®) by trained medical personnel. Households in which at least one individual tested positive by RDT were classified as a positive household. Positive and negative households were plotted as a data layer in ArcGIS 9.2.
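The selection and classification steps described above can be sketched in a few lines of Python. This is only an illustration: the function names, the fixed random seed, and the assumption that cohort allocation was itself random are not stated in the text, and the example RDT results are invented for demonstration.

```python
import random

def draw_households(frame_ids, n_longitudinal, n_cross_sectional, seed=0):
    """Randomly select households from the sampling frame and split them
    into a longitudinal and a cross-sectional cohort (hypothetical helper)."""
    rng = random.Random(seed)
    chosen = rng.sample(list(frame_ids), n_longitudinal + n_cross_sectional)
    return chosen[:n_longitudinal], chosen[n_longitudinal:]

def classify_households(rdt_results):
    """Map household id -> True if at least one resident was RDT positive."""
    status = {}
    for household_id, is_positive in rdt_results:
        status[household_id] = status.get(household_id, False) or is_positive
    return status

# Sampling frame of 8,751 enumerated households; 35 + 93 = 128 selected.
frame = range(1, 8752)
longitudinal, cross_sectional = draw_households(frame, 35, 93)

# Invented individual-level results: (household id, RDT positive?).
example_results = [(1, False), (1, True), (2, False), (2, False)]
household_status = classify_households(example_results)
```

A household is marked positive as soon as any resident tests positive, which mirrors the household-level outcome used in the later regression.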

Landscape characterization
A digital elevation model (DEM) for the area with 1 m horizontal resolution was derived from the Shuttle Radar Topography Mission (SRTM) version 3 DEM with 90 m pixels. Elevation values correspond to the reflective surface on the earth and represent soil surface, vegetation or man-made structures. The SRTM imagery was collected during a 2000 space shuttle mission using a multi-frequency, multi-polarization radar system. Each pixel represented a 30 m average elevation around each pixel's center. The relative horizontal accuracy was ±15 m (90% circular error) with a relative vertical accuracy of ±6 m (90% vertical error).

Topographic wetness
The digital elevation model was processed in Imagine 9.1 (Earth Resource Data Analysis System [ERDAS], Norcross, Georgia) and imported into ArcGIS 9.2. Imagery and point locations were geo-referenced to UTM zone 35S, WGS 1984. The ArcGIS extension Terrain Analysis Using Digital Elevation Models [13] was used to model water flow and calculate an index of topographic wetness (ITW). Data were smoothed to fill in isolated elevation pits or spikes, which typically represent errors or areas of internal drainage that interrupt estimates of water flow. Slope and flow directions were determined using the multiple direction algorithm (MDA) [14] and the flat area flow direction methods [15]. The MDA method used the steepest slope of triangular facets, allowing water to flow in any direction. The ITW is an indicator of potential moisture, assuming surface homogeneity for soil and vegetation, and is calculated using the ratio of upslope contributing area and local slope (the tangent of slope = tanβ). A pan-sharpened Quickbird imagery scene (DigitalGlobe, Longmont, Colorado) (2.5 m multispectral) collected on June 12, 2007 was used to evaluate the hydrological model. Drainages were categorized according to the Strahler stream order classification into first through fifth order streams [16,17], such that, for example, a second order stream is formed when two first order streams join.
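The two quantities defined in this paragraph can be illustrated with a short Python sketch. It assumes the commonly used logarithmic form of the wetness index, ln(a / tanβ), which the text does not state explicitly, and it applies the Strahler rule at a single confluence rather than over a full stream network. The function names and the minimum-slope guard are illustrative choices, not part of the study's workflow.

```python
import math

def wetness_index(contributing_area, slope_deg, min_slope_deg=0.01):
    """Topographic wetness as ln(a / tan(beta)).

    contributing_area: upslope contributing area (a) for the cell.
    slope_deg: local slope (beta) in degrees; clamped to min_slope_deg
    so that perfectly flat cells do not cause a division by zero.
    """
    beta = math.radians(max(slope_deg, min_slope_deg))
    return math.log(contributing_area / math.tan(beta))

def strahler_order(tributary_orders):
    """Strahler order of the stream formed where the given tributaries join."""
    if not tributary_orders:
        return 1  # a headwater stream with no tributaries is first order
    highest = max(tributary_orders)
    # The order increases only when at least two tributaries share the highest order.
    if tributary_orders.count(highest) >= 2:
        return highest + 1
    return highest
```

For the same contributing area, a flatter cell receives a higher wetness value, and a second order stream joined by a first order stream remains second order.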

Topographic position index
The topographic position index (TPI) was generated in ArcView 3.3 (ESRI, Redlands, California) with an extension by Jenness [18]. The TPI classifies the landscape by slope position (low, middle, high) and landform type (plain, valley, ridge), and represents the difference between the elevation at a point and the elevations of neighbourhood cells. TPI values near zero are typical of flat or mid-slope locations. High values signify areas such as hill tops and ridges, while low values are indicative of valley floors. Because the TPI is scale dependent, local (500 m) and area-wide (2 km) scales were considered. The 500 m neighbourhood detected local valleys and hills while the 2 km neighbourhood identified larger scale features such as large U-shaped valleys, gently sloped hills and tops of plateaus.
The magnitude of the TPI and the area's slope were used to classify the slope position according to Weiss [19], based on the TPI score standard deviations and slope values. Slope position classes were valley, lower slope, flat slope, middle slope, upper slope and ridge. Ten landform classes (deep streams, shallow valleys and mid-slope drainage pathways, upland drainage areas, U-shaped valleys, plains, open slopes, upper slopes and mesas, local ridges and hills in large valleys, mid-slope of ridges and small hills in plains, and high ridges) were generated by comparing standardized TPI values (standardized TPI = [TPI - TPI mean]/[TPI standard deviation]) for TPI values at 500 m (TPI500), 2 km (TPI2000) and slope [19].
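A minimal sketch of the TPI calculation and slope position classification follows. Several details are assumptions rather than the study's settings: a square neighbourhood is used instead of whatever neighbourhood shape the ArcView extension applied, the class cut-offs (±0.5 and ±1 standard deviation) and the 5° flat-slope limit are the values commonly quoted for the Weiss scheme, and all function names are hypothetical.

```python
import numpy as np

def tpi(dem, radius_cells):
    """Elevation of each cell minus the mean elevation of its neighbours.

    Uses a square (2r+1) x (2r+1) window with edge padding; the centre
    cell is excluded from the neighbourhood mean.
    """
    dem = np.asarray(dem, dtype=float)
    size = 2 * radius_cells + 1
    padded = np.pad(dem, radius_cells, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, (size, size))
    neighbour_mean = (windows.sum(axis=(-2, -1)) - dem) / (size * size - 1)
    return dem - neighbour_mean

def standardize(values):
    """Standardized TPI = (TPI - TPI mean) / (TPI standard deviation)."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()

def slope_position(z, slope_deg, flat_slope_deg=5.0):
    """Six slope position classes from standardized TPI (z) and slope."""
    if z > 1.0:
        return "ridge"
    if z > 0.5:
        return "upper slope"
    if z >= -0.5:
        # Near-zero TPI: slope separates mid-slope from flat ground.
        return "middle slope" if slope_deg > flat_slope_deg else "flat slope"
    if z >= -1.0:
        return "lower slope"
    return "valley"
```

Running `tpi` with two different radii (for example, cells equivalent to 500 m and 2 km) gives the two scales that are compared when assigning landform classes.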

Statistical analyses
Incident malaria infections were estimated using data from the longitudinal survey. An incident infection was defined as an individual with a positive RDT after a prior negative RDT, or an individual with two consecutive positive RDTs more than 30 days apart. The month of infection was taken as the mid-point between a negative and positive RDT test, or the mid-point between one month after the first positive RDT and the time of the subsequent positive RDT. Logistic regression was used to identify environmental conditions associated with the odds of a household having an individual with a positive RDT. Distances of the surveyed households from the five stream orders were aggregated into quartiles. Household elevation above the study area baseline (1,009 m) in 10 m increments and slope were included in the model as continuous variables. Aspect was coded so that households with an eastern or south-eastern exposure were coded as 1, those with western and north-western exposure were coded -1 and all others as 0. Analyses were performed using Statistix version 8.0 (Analytical Software, Tallahassee, Florida). The spatial structure of residual errors was examined using Global Moran's I to determine whether the residuals were spatially autocorrelated, a systematic departure from the assumption that errors are spatially independent that would inflate the significance of the environmental risk factors estimated by logistic regression.
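Two of the steps above, the aspect coding and the Global Moran's I check on residuals, can be sketched as follows. The 45° compass sectors used to define "eastern", "south-eastern", "western" and "north-western" are an assumption, as is the binary distance-band weighting in `morans_i`; the study does not report which spatial weights were used, and the function names are illustrative.

```python
import numpy as np

def code_aspect(aspect_deg):
    """Code aspect as 1 (E or SE), -1 (W or NW) or 0 (all other directions).

    Assumes eight 45-degree compass sectors measured clockwise from north.
    """
    a = aspect_deg % 360.0
    if 67.5 <= a < 157.5:      # east and south-east
        return 1
    if 247.5 <= a < 337.5:     # west and north-west
        return -1
    return 0

def morans_i(values, coords, bandwidth):
    """Global Moran's I with binary weights for pairs within `bandwidth`.

    values: e.g. model residuals, one per household.
    coords: (x, y) locations in projected units such as UTM metres.
    Pairs at zero distance (including each point with itself) get weight 0.
    """
    x = np.asarray(values, dtype=float)
    xy = np.asarray(coords, dtype=float)
    dist = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    w = ((dist > 0) & (dist <= bandwidth)).astype(float)
    z = x - x.mean()
    return (len(x) / w.sum()) * (z @ w @ z) / (z @ z)
```

Values of I near zero suggest little spatial structure in the residuals, positive values indicate clustering of similar residuals, and negative values indicate that neighbouring residuals tend to differ.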

Characteristics of the study population
There were 8,751 households identified from the Quickbird satellite image in the study area (Figure 2). A total of 768 individuals from 128 randomly selected households were enrolled (Figure 1). Thirty-five households were enrolled in the longitudinal survey and were visited more than once. Initial RDT results from the 270 individuals residing within these 35 households were used to characterize the prevalence of malaria in these households. These data were combined with the 498 individuals in 93 households enrolled in the cross-sectional survey.
The median age of study participants was 12 years.

Landforms and aspect of sampled households were representative of households in the study area
Landforms occupied by the surveyed households were similar to those not surveyed in the study area. Surveyed (n = 128) and non-surveyed (n = 8,623) households were found on plains (85% vs. 75%, respectively), mid-slope drainage pathways (6% vs. 7%), mesas (4% vs. 6%), ridges (3% vs. 5%), canyons (2% vs. 2%), U-shaped valleys (0% vs. 5%) and open slopes (0% vs. 0.002%). Households in U-shaped valleys were not represented in the surveyed households, as were two landforms not present in the study area, open slopes and headwaters.
The aspect of the land described by the compass direction of the steepest slope for an individual pixel was used to determine the direction the sloping land faced (flat surfaces were assessed separately). Generally, western and north-western facing slopes are warmer and drier than comparable eastern and south-eastern facing slopes in the southern hemisphere. The distribution of slope directions for all households within the study area was proportional to the slope directions for land in the study region, indicating residents did not preferentially select a particular aspect when constructing households (see Additional File 1). Land aspects of the surveyed households matched the distribution of the non-surveyed households, with the exception of slight under-sampling of households on north-western facing slopes and a slight excess of households facing north-east (see Additional File 1).

Lower elevation was associated with increased risk of malaria
The elevation of households in the study area ranged from 1,009 to 1,268 m above sea level. Households with RDT positive individuals were significantly lower in elevation on average (elevation mean ± standard deviation: 1,079.1 ± 28.92 m; range 1,014 to 1,143 m) than households with only RDT negative individuals (1,105.6 ± 46.21 m; range 1,040 to 1,247 m) (P = 0.0001).

Lesser slope was associated with increased risk of malaria
Households with RDT positive individuals were approximately 50% more likely than RDT negative households to occupy eastern and south-eastern facing slopes (see Additional File 1). Conversely, RDT negative households were twice as likely to occupy north-western and western facing slopes compared to RDT positive households. The slope of the land, as measured in degrees, characterizes the rate at which the elevation of the local area changes. The study area was generally flat, with an average slope surrounding households of 0.026° ± 0.019°. On average, RDT positive households were on flatter ground (slope of 0.024° ± 0.014°) than RDT negative households (slope of 0.031° ± 0.023°; P = 0.04).

Proximity to third order streams was associated with increased risk of malaria
The risk of malaria associated with living increasingly closer to each of the five stream orders was assessed using logistic regression, with the furthest distance used as the reference category. Households with at least one RDT positive individual were compared with households in which no individuals were RDT positive. Only distances from third and fifth order streams were associated with the odds of a household having an RDT positive individual. The risk decreased the closer a household was to the only fifth order stream in the region. Households within 4.5 km of a fifth order stream were 2.6 (95% CI 1.1-6.2) times less likely to be positive than households greater than 20.6 km from a fifth order stream. In contrast, living in proximity to a third order stream increased the risk of RDT positivity. Households within 1.98 km of a third order stream were 2.8 (95% CI 1.2-6.9) times more likely to have an RDT positive resident than households situated more than 6.0 km from a third order stream.
The multivariable logistic model identified the nearest quartile of distance to third order streams, and the interval between the first quartile and the median of distances to third order streams, as significant predictors of the odds of a household having an RDT positive individual (Table 1). There was an increased risk of malaria for persons living within 3.75 km of a third order stream. Households at elevations above the baseline elevation for the region (1,009 m) were at decreasing risk of having an RDT positive individual. For every 10 m rise in elevation the risk decreased by approximately 13%. No other environmental characteristics were significantly associated with the odds of a household having an RDT positive individual in the multivariable analyses. The residual errors did not show significant spatial autocorrelation (I = -0.07), so alternative model structures were not incorporated. Generalized linear mixed models were evaluated but they did not improve the fit of the model and were not considered further.
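Because elevation entered the logistic model as a continuous variable in 10 m increments, its effect compounds multiplicatively on the odds scale. The short sketch below assumes the reported 13% decrease corresponds to an odds ratio of about 0.87 per 10 m; Table 1 is not reproduced here, so that value and the function name are illustrative only.

```python
def odds_ratio_for_rise(rise_m, odds_ratio_per_10m=0.87):
    """Odds ratio implied by a rise of `rise_m` metres above baseline.

    In a logistic model the log odds change linearly with elevation,
    so the odds ratio for k increments of 10 m is the per-10 m ratio
    raised to the power k.
    """
    return odds_ratio_per_10m ** (rise_m / 10.0)

# Under this assumption, a household 50 m above the baseline would have
# roughly half the odds of a household at the baseline elevation.
odds_ratio_50m = odds_ratio_for_rise(50)
```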

Spatial risk map for malaria
A spatial risk map for the study area was generated based on the regression analyses (Figure 3). The risk of malaria was higher in the north of the study region and in proximity to third order streams. However, there was a large region of low risk between the northern region and the two fourth order streams running south.

Validation of the risk map using incidence cases
An estimate of incident malaria cases from the longitudinal surveys was used to evaluate the predictive ability of the risk map based on prevalence data and to locate areas where new malaria cases were identified. Households where new infections occurred were more likely to be located in high-risk areas based on the prevalence data. The incidence of an RDT positive individual within a household showed a rapid rise as the value of the logistic function exceeded 0.52 (Figure 5). In households where the function exceeded 0.60, household incidence of an RDT positive individual averaged 10%/month compared with 1.9%/month for households with lower risk (Figure 5).

Using the spatial risk map to target malaria control interventions
To explore the consequences of identifying households for targeted interventions based on the risk map, the proportion of households above different thresholds in predicted risk of having an RDT positive individual was determined and these estimates were extrapolated to all the households in the region. To identify 50%, 80% and 90% of the households with RDT positive individuals, logit estimates of p > 0.53, 0.31 and 0.26, respectively, were used from the empirical distribution of predicted values. Among the surveyed households, lowering the threshold resulted in fewer households with RDT positive individuals being excluded (27, 11 and 5 households with at least one RDT positive individual were missed, respectively). As a measure of specificity, these thresholds also excluded 43, 21 and 17 households with no RDT positive individuals, so that 62%, 66% and 77% of households below these thresholds would have no RDT positive individuals. In addition to the presence of individuals with RDT positive results in the households, there also was a substantial difference in the frequency of infection as measured by the proportion of tests performed that gave positive results. For example, when the threshold was selected to capture 90% of the households with any RDT positive results, only five of the 338 (1.5%) RDT tests performed in households below the threshold were positive. By comparison, in households above the threshold there were 40 households with at least one RDT positive individual and there was a substantially higher proportion of positive RDTs among individuals residing in these households (114/579 = 19.7% of tests).
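The proportions quoted in this paragraph can be checked with simple arithmetic. The sketch below assumes that the 43, 21 and 17 households without RDT positive residents are those falling below each threshold, alongside the 27, 11 and 5 positive households that were missed; that reading is an interpretation of the text, and the variable names are illustrative.

```python
# Households below each threshold, keyed by the targeted share (%) of
# positive households the threshold was chosen to capture.
missed_positive = {50: 27, 80: 11, 90: 5}
negative_below = {50: 43, 80: 21, 90: 17}

def share_negative_below(target):
    """Proportion of below-threshold households with no RDT positive resident."""
    negatives = negative_below[target]
    return negatives / (negatives + missed_positive[target])

# Test positivity below and above the threshold chosen to capture 90%.
positivity_below_90 = 5 / 338
positivity_above_90 = 114 / 579
```

Under this reading the three proportions come to roughly 61%, 66% and 77%, close to the reported values, and test positivity is about 1.5% below versus 19.7% above the 90% threshold.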
To target malaria control interventions to all households at or above the 90th percentile of predicted risk would require that 38.8% (3,398) of the 8,751 households in the region receive the intervention. Targeting households in the top 80th percentile would require directing malaria control interventions to 23.6% (2,067) of the households, and targeting the top 50th percentile would require directing interventions to 20.3% (1,775) of the households.
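The trade-off between coverage and effort can be made explicit from the household counts reported above. This is a small arithmetic sketch with illustrative variable names; it uses only the figures given in the text.

```python
TOTAL_HOUSEHOLDS = 8751

# Households that would need to be reached at each targeting level (%).
targeted = {50: 1775, 80: 2067, 90: 3398}

def targeted_share(level):
    """Fraction of all households in the region that must be reached."""
    return targeted[level] / TOTAL_HOUSEHOLDS

# Additional households needed for each step up in coverage.
extra_50_to_80 = targeted[80] - targeted[50]
extra_80_to_90 = targeted[90] - targeted[80]
```

Moving from the 50% to the 80% level adds 292 households, whereas moving from 80% to 90% adds 1,331, which is the asymmetry that makes the 80% level attractive when resources are limited.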

Discussion
Remote sensing enabled identification of environmental risk factors for P. falciparum parasitaemia in a region of declining malaria transmission in southern Zambia and the generation of a spatial risk map predictive of incident malaria cases. Proximity to third order streams and low altitude were environmental characteristics associated with households in which a resident had parasitaemia, presumably because of proximity to anopheline breeding sites in these ecological settings. A strength of this methodology is that parasite prevalence was measured in symptomatic and asymptomatic individuals residing in randomly selected households using satellite images and was not subject to biases inherent in passive detection at health care facilities or limited to symptomatic individuals. Furthermore, the landforms of surveyed households were representative of all households in the study area. This study is one of the first to identify environmental risk factors for malaria in a region of declining malaria transmission and accelerated control efforts. Using data readily generated with remote sensing technologies, this approach can be used to guide targeted malaria control interventions to further reduce malaria transmission.
Remote sensing has been used to identify risk factors for malaria in regions of high malaria transmission [20]. Commonly identified risk factors include land elevation [21] and proximity to vector breeding sites. In a region of high malaria endemicity in Ghana, small-scale heterogeneity in risk was identified between villages, with increased risk in those households closer to the forest fringe [22]. Prior studies have identified proximity to bodies of water as a risk factor for malaria, although none identified a specific stream order. In a region of low malaria endemicity in northern Tanzania, living close to the river was an independent predictor for malaria infection based on passive case detection at a health care facility, and the authors suggest that interventions should be targeted to households close to the river [23]. Hydrologic modelling was used to assess the risk of malaria in the highlands of western Kenya [24]. Topography-derived wetness indices were significantly associated with household-level malaria incidence, independent of elevation. Specifically, households with cases of malaria were located 280 m closer to regions with high wetness indices than control households. However, in a highly malaria endemic area in western Côte d'Ivoire, proximity to a river was not significantly associated with the risk of malaria after adjusting for spatial correlation [25].
Figure 5 Household incidence of malaria compared with log odds of residing in a high risk area.
A malaria risk map was generated based upon the environmental risk factors, which allowed malaria risk to be extrapolated throughout the study area. This map was based on parasite prevalence data from randomly selected households and validated with incidence data, confirming the biological significance of the identified risk factors. Other, potentially more biased, sampling methods have been used to generate malaria risk maps. For example, local clustering was identified and risk maps generated from malaria morbidity data in East Shoa, Ethiopia and digital elevation models [26], and risk models have been generated on smaller scales [27]. Generating risk maps from modelling approaches such as logistic regression makes strong assumptions about the form of the residual variation, especially that the errors are independently distributed rather than retaining spatial structure. Thus, testing for spatial structure in the residual variation is needed to ensure that the significance of the point estimates of the environmental risk factors is not overestimated and that, at the scale of the analysis, remaining undetected cofactors have not been ignored.

Conclusions
For national planning of malaria control strategies, large-scale maps are needed. Validated remote sensing technologies allow generation of large-scale risk maps for targeting of malaria control interventions. Using readily available ecological data and remote sensing technologies, a malaria risk map was generated for targeted interventions in a region of decreasing malaria transmission in southern Zambia. This analysis suggests that increasing the targeted area from 50% to 80% of high-risk households requires reaching a much smaller proportion of households than increasing the target area from 80% to 90% of high-risk households. Although different settings may have a different set of predictors, such information can guide allocation of limited resources to achieve further malaria control in regions of declining malaria transmission.