The use of spatial and genetic tools to assess Plasmodium falciparum transmission in Lusaka, Zambia between 2011 and 2015

Background Zambia has set itself the ambitious target of eliminating malaria by 2021. To continue tracking transmission to zero, new interventions, tools and approaches are required. Methods Urban reactive case detection (RCD) was performed in Lusaka city from 2011 to 2015 to better understand the location and drivers of malaria transmission. Briefly, index cases were followed to their home and all consenting individuals living in the index house and nine proximal houses were tested with a malaria rapid diagnostic test and treated if positive. A brief survey was performed and, for certain responses, a dried blood spot sample was collected for genetic analysis. Aggregate health facility data, individual RCD response data and genetic results were analysed spatially and against environmental correlates. Results The total number of malaria cases remained relatively constant, while the average age of incident cases and the proportion of incident cases reporting recent travel both increased. The estimated R0 in Lusaka was < 1 throughout the study period. RCD responses performed within 250 m of uninhabited/vacant land were associated with a higher probability of identifying additional infections. Conclusions Evidence suggests that the majority of malaria infections are imported from outside Lusaka. However, some local transmission persists on the periphery of urban settlements, particularly in the wet season. Unfortunately, due to the higher-than-expected complexity of infections and the small number of samples tested, genetic analysis was unable to identify any meaningful trends in the data.


Background
Transmitted by mosquitoes of the Anopheles genus, malaria killed an estimated one million people in the year 2000, the vast majority of whom lived in sub-Saharan Africa. Scale-up of insecticide-treated mosquito nets (ITN) to prevent malaria transmission and effective artemisinin-based combination therapy (ACT) to treat malaria among those infected have greatly reduced the burden of malaria in sub-Saharan Africa [1]. Inspired by the progress that these interventions have made, attention is turning away from just controlling malaria disease and reducing deaths to eliminating transmission of the parasite. Zambia is one country that has made great strides in reducing malaria through implementing proven interventions [2], and has now set itself the ambitious goal of eliminating malaria by 2021 [3]. To realize this goal, elimination will have to be achieved in all environmental settings.
Malaria transmission is most intense in rural areas [4,5], which provide preferred habitat for the Anopheles mosquitoes that tend to lay eggs in clean water. While rural settings contribute the majority of transmission events, urban malaria persists and, unless understood, presents a threat to elimination [4,6]. Many malaria-endemic regions are becoming more urban [7]. Indeed, Zambia's urban population increased from 35% of the total in 2000 to 40% in 2010 [8].
In Zambia, like many developing countries, urban growth is fastest in unplanned settlements that are prone to flooding [9], and hence may be at increased risk for malaria transmission. Unplanned peri-urban areas lack sewers or drainage systems, which allows for pooling of water to provide breeding sites for malaria vectors and increase the risk of malaria transmission [10][11][12][13][14][15]. Furthermore, unplanned settlements are typically built upon less desirable land, and often in proximity to existing natural breeding sites such as swamps or other hydrological networks associated with increased malaria risk [11,13,16-19]. Finally, agricultural activities, with their associated irrigation systems, can provide breeding sites and, therefore, increase malaria transmission in urban and peri-urban areas [19][20][21][22][23]. The spatial heterogeneity of these factors in urban areas leads to huge variation in the entomological inoculation rate (EIR) both across and within cities throughout malaria-endemic regions. Keiser et al. found the EIR varied from 0 to 54 infective bites per person per year in urban areas throughout sub-Saharan Africa [24], although this range was estimated before the escalation of ITNs and ACT across the continent [25].
In Lusaka, the majority of malaria cases are associated with travel outside the city [26]. Because of the lower inherent transmission capacity for urban malaria, travel outside of urban areas to areas of higher malaria transmission is a primary risk factor for a case [27][28][29][30]. The risk that a traveller who acquires a malaria infection poses to neighbors upon the traveller's return, i.e. the risk of onward transmission, depends upon the transmission capacity of the traveller's return site [31], and will vary based upon the characteristics described previously.
In most settings it is possible to measure and then track changes in malaria transmission dynamics through cross-sectional parasite prevalence surveys, longitudinal routine health system metrics, or longitudinal entomological surveillance [32]. However, as malaria approaches elimination, the ability to define transmission with statistical confidence requires sample sizes that are often unachievable due to the rarity of malaria infections. While the rarity of an event (in this case, the presence of a parasite in a person) cannot be changed, molecular tools can improve both the sensitivity and specificity with which those events are detected and the amount of information derived from them. For example, PCR can be used to identify infections that are below the level of detection of rapid diagnostic tests (RDT) or microscopy. Furthermore, genetic analysis can determine both the complexity of infection, i.e. the number of malaria co-infections, and/or the genetic haplotype of the infections. These genetic analyses have been used to show changes in transmission [33], as well as clonal expansion during an epidemic [34]. Where possible, molecular work, as reported here, should be performed locally [35] to ensure that data are understood and applied to decision-making.
This manuscript examines the central question of whether there is ongoing malaria transmission within urban Lusaka, Zambia. Further, the ability to combine molecular and spatial tools to first identify whether transmission is occurring, and second to identify areas where transmission is more likely to be occurring, was examined.

Study site
Lusaka is the capital of Zambia and the prime economic hub of the country. The city lies at ~ 1300 m above sea level, and although it does not snow, temperatures in the cold season can drop to below 10 °C. Like many major cities in lower-income countries, Lusaka is made up of a combination of planned and unplanned settlements, and its workforce includes both formal and informal occupations. No malaria infections have been found within Lusaka in any of the Malaria Indicator Surveys since 2006, yet cases continue to be found through the public health system (Fig. 1). The reported primary malaria vectors in Lusaka are Anopheles gambiae sensu stricto (s.s.) and Anopheles arabiensis [36], however entomological surveillance has been challenging due to the very low vector numbers. Malaria testing and treatment is free to all individuals in the public health centres, which have improved their malaria case management dramatically in recent years [26]. A total of 27 health facilities within the city of Lusaka and under the management of the Lusaka District Health Management Team (80% of total health facilities) were included in this study (Fig. 2). Ten of these facilities were included throughout the 2011-2015 period, while 17 additional facilities were enrolled only during the final year. Facilities without an environmental health technician (EHT) were not included as the EHT co-ordinated the RCD responses.

Reactive case detection
With funding from the President's Malaria Initiative (PMI), reactive case detection (RCD) commenced in 10 health facilities in Lusaka in 2011 with the goal of improving understanding of malaria transmission within the city and finding problematic transmission hotspots [37]. Prior to this date, no community follow-ups had been made for any health facility (HF) incident case.
RCD has been described in detail elsewhere [37]. In brief, a team follows up a confirmed index case and tests the index case household and 9 closest neighbouring houses for malaria infection, treating those who test positive.
While almost all government health facilities in Lusaka participated at some point in intensified surveillance activities, human and financial resources were insufficient to follow up all HF incident cases, particularly in the wet season. For those index cases followed up, a range of data was collected for both index cases and RCD participants, including demographic details, symptoms, travel, and malaria infection history in the last month (see Additional files 1 and 2). Household-level data, including GPS location and history of IRS in the last 12 months, were also collected. In 2014, a grant from the Malaria Eradication Scientific Alliance (MESA) allowed RCD operations to be intensified and expanded to a total of 27 health facilities in the city. The expansion ensured that more case investigations were performed, and the collection of dried blood spots (DBS) from consenting individuals was added to the information previously collected.

Data sources
The RCD database, collected from 10 HFs in 2011-2014 and 27 HFs in 2014-2015, was used for this analysis. The database contains more information than the standard health management information system in Zambia, with numerous characteristics about each incident malaria case including age, travel history, sex, and whether the case was followed up through RCD. The database also includes information on RCD responses, with individual-level information for all RCD households and participants.

Descriptive analysis
From the RCD data, trends were analysed as follows. First, the proportion of incident malaria cases that reported travel outside Lusaka district in the previous 1 month was determined. Assuming that at least some portion of incident cases reporting travel outside Lusaka were imported, the formula from Churcher et al. was used to estimate a crude reproductive number (R0) for the city [38]. The sensitivity of the estimate to this importation and travel assumption was tested by running simulations in which different proportions of cases reporting travel outside Lusaka were considered as imported. Second, the age of incident malaria cases over time was determined, as the mean age of incident malaria in a population is an indicator of the intensity of malaria transmission [39][40][41].
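As a rough illustration of the sensitivity analysis described above, the sketch below assumes the simple subcritical branching-process relationship in which each imported case seeds on average 1/(1 - R) cases in total, so that R is approximately 1 minus the proportion of cases that are imported. Whether this is the exact form of the Churcher et al. estimator applied in the study is an assumption; the function names and the case counts are hypothetical and are not study data.

```python
def crude_r(total_cases, travel_cases, imported_fraction):
    """Crude reproduction number under a subcritical branching-process
    assumption: R ~= 1 - (imported cases / total cases).

    imported_fraction is the assumed share of travel-reporting cases
    that were truly imported (the quantity varied in a sensitivity run).
    """
    if total_cases <= 0:
        raise ValueError("total_cases must be positive")
    if not 0 <= travel_cases <= total_cases:
        raise ValueError("travel_cases must lie between 0 and total_cases")
    if not 0.0 <= imported_fraction <= 1.0:
        raise ValueError("imported_fraction must lie between 0 and 1")
    imported = travel_cases * imported_fraction
    return 1.0 - imported / total_cases


def sensitivity(total_cases, travel_cases, fractions):
    """Estimate R for each assumed imported fraction."""
    return {f: crude_r(total_cases, travel_cases, f) for f in fractions}


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not study data).
    for frac, r in sensitivity(1000, 700, [0.4, 0.6, 0.8, 1.0]).items():
        print(f"imported fraction {frac:.1f}: R ~= {r:.2f}")
```

Under this approximation the estimate stays below 1 whenever any cases are treated as imported, and it falls as the assumed imported fraction rises, which is why the choice of that fraction matters.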

Environmental analysis
The topographical position index (TPI) and topographical wetness index (TWI) are associated with increased risk of malaria vector breeding sites and in some cases increased risk of malaria transmission [42][43][44][45]. Both indices are derived from a digital elevation model. Google Earth Engine was used to calculate TWI as well as TPI at resolutions of 300 m and 2000 m. Additionally, the enhanced vegetation index (EVI) at a spatial scale of 250 m and a monthly temporal scale was retrieved. Using Quantum GIS version 2.0.1 and the OpenLayers plugin, uninhabited areas, defined as areas without a rooftop, were traced from satellite imagery. The Euclidean distance from the geocoordinates of the index case household to the nearest uninhabited area was then measured in increments of 50 m, i.e. 0-50 m, 50-100 m, and so on. These environmental measures were matched to the geo-coordinates of RCD participant households in 2014 and 2015 using the raster package [46,47] in R version 3.3.1 [48].
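The distance binning described above could be sketched as follows. This is a minimal illustration, assuming that household and uninhabited-area coordinates have already been projected to metres (for example UTM) and that each uninhabited area is represented by a set of traced boundary points. The study itself performed these steps in Quantum GIS and R; the function names here are hypothetical, and treating the 250 m threshold as inclusive is also an assumption.

```python
import math


def nearest_distance_m(household, boundary_points):
    """Euclidean distance (metres) from a household to the nearest
    traced boundary point of any uninhabited area."""
    if not boundary_points:
        raise ValueError("at least one boundary point is required")
    hx, hy = household
    return min(math.hypot(hx - x, hy - y) for x, y in boundary_points)


def distance_bin(distance_m, width_m=50):
    """Return the (lower, upper) bounds in metres of the bin that
    contains distance_m, e.g. (0, 50) or (50, 100)."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    lower = int(distance_m // width_m) * width_m
    return lower, lower + width_m


def within_threshold(distance_m, threshold_m=250):
    """Flag households at or within the threshold distance
    (inclusive boundary assumed)."""
    return distance_m <= threshold_m
```

A household at projected coordinates (0, 0) with the nearest boundary point at (30, 40) lies 50 m away and falls in the 50-100 m bin.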

Regression model analysis
Two separate outcomes with regard to the RCD data were examined. First, the probability of testing positive for a P. falciparum infection during RCD was assessed, with RCD participants as the unit of analysis. Second, the probability of finding a P. falciparum infection during RCD was assessed, with each RCD investigation as the unit of analysis.
Factors associated with testing positive for a P. falciparum infection during RCD were examined as follows. A priori hypotheses suggested that travel outside of Lusaka, season, person's age, person's gender, and location of the household (person living in the index case household or not) could be associated with having a P. falciparum infection. A logistic regression approach with the index case included as a random intercept in the model was utilized. In the general model used to assess the relationship between testing positive for a P. falciparum infection during RCD and the hypothesized factors, π_ijk is a dichotomous outcome for person i in household j participating in the RCD for index case k; Travel_i is whether that person travelled outside Lusaka in the previous 2 weeks or not; Season_k is whether the RCD was conducted during the high transmission season or not; Age_i is the age of the person categorized as < 5, 5-15, and > 15 years of age; Sex_i is whether the person is male or female; Location_j is whether the person lives in the index case household or not; and χ_k is a random intercept for RCD activities associated with index case k that is assumed to be normally distributed with a mean of zero.
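A random-intercept logistic model consistent with the variable definitions above can be written as follows. This is a reconstruction offered as a sketch, not necessarily the authors' exact specification: the intercept β0, the numbering of the coefficients, and the variance symbol σ² are assumptions, and the three-level age category would in practice enter as two indicator terms.

```latex
\operatorname{logit}(\pi_{ijk}) = \beta_0
  + \beta_1\,\mathrm{Travel}_i
  + \beta_2\,\mathrm{Season}_k
  + \beta_3\,\mathrm{Age}_i
  + \beta_4\,\mathrm{Sex}_i
  + \beta_5\,\mathrm{Location}_j
  + \chi_k,
\qquad \chi_k \sim \mathcal{N}(0, \sigma^2)
```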
Factors associated with finding an RDT-positive individual during RCD were examined as follows. A priori hypotheses suggested that travel outside of Lusaka, season, distance from uninhabited areas, age, and sex (gender) of the incident malaria case could be associated with finding more positives during RCD. In the wet season, distance from uninhabited area, travel, season, age and gender were all retrieved from the surveillance database. The number of positives found during an RCD response was right-skewed and overdispersed, so a negative-binomial regression was used to determine the association with the hypothesized factors. The general model used to assess the relationship between the number of P. falciparum infections found during RCD and the hypothesized factors is given by the following equation:
$$\mu_i = e^{\ln(t_i) + \beta_1 \mathrm{Location}_i \times \beta_2 \mathrm{Season}_i + \beta_3 \mathrm{Age}_i + \beta_4 \mathrm{Sex}_i + \beta_5 \mathrm{Travel}_i}$$
where μ_i is the number of P. falciparum infections found during RCD of index case i; t_i is the number of people tested during RCD of index case i; Location_i is whether the household was located within 250 metres of an uninhabited area of Lusaka or not; Season_i is whether the RCD was conducted during the high or low malaria transmission season; Age_i is the age of the index case categorized as < 5, 5-15, or > 15 years; Sex_i is whether the index case was male or female; and Travel_i is whether or not the index case travelled outside Lusaka in the previous 2 weeks.
All regression analyses were conducted in Stata version 13.1.

Genetic analysis
From 2014, a further aim of collecting a DBS from every index case and all RCD participants was added to the protocol. Unfortunately, challenges in the field meant that some DBS were not collected, were incorrectly labelled, incorrectly stored or lost during transit to the laboratory. A subset of RCD responses was selected for molecular analysis based on the completeness of the sample record, i.e. only those responses with a complete or near-complete (> 85%) DBS sample repository were analysed.
DNA was extracted from RDT-negative DBS either individually or in pools of 10 using a QIAamp (Qiagen) mini-spin column or DNA IQ system (Promega) as per manufacturer's instructions, and amplified using photoinduced electron transfer PCR (PET-PCR) as previously described [49]. Positive pools were deconvoluted to identify individual positives. PCR/RDT positives were then genotyped/barcoded using the Taqman assay as described elsewhere [50].
Barcoded samples with ≥ 11 missing loci (out of 24) were classified as incomplete and removed from any further analysis. Infections were classified as polyclonal using a cutoff of ≥ 4 loci with a mixed infection call [33,50,51]. Genetic relatedness was calculated using a modified SNP π [52], which accounts for samples with missing data and mixed infections [52], and visualized using a neighbour-joining phylogenetic tree.
The complexity of infection (COI) was determined for complete barcoded samples using the COI Likelihood (COIL) calculator developed by Galinsky et al. [53]. In brief, COIL uses Bayesian methodology to estimate the probable number of infections that are present within a single sample, based on the number of isolated pairs.
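The barcode filtering and polyclonality rules described in this section can be expressed compactly. The sketch below is an illustration only: it assumes each 24-SNP barcode is a string in which "X" marks a missing call and "N" marks a mixed call, an encoding chosen here for convenience rather than taken from the study, and the function name is hypothetical. The thresholds (≥ 11 missing loci, ≥ 4 mixed calls) are those stated in the text.

```python
MISSING = "X"  # assumed symbol for a missing/failed call
MIXED = "N"    # assumed symbol for a mixed-allele call
N_LOCI = 24    # barcode length stated in the text


def classify_barcode(barcode, min_missing_incomplete=11, min_mixed_polyclonal=4):
    """Classify one barcode as 'incomplete', 'polyclonal' or 'monoclonal'.

    A barcode with >= 11 missing loci is incomplete and excluded; otherwise
    it is polyclonal when >= 4 loci carry a mixed call.
    """
    if len(barcode) != N_LOCI:
        raise ValueError(f"expected {N_LOCI} loci, got {len(barcode)}")
    n_missing = barcode.count(MISSING)
    n_mixed = barcode.count(MIXED)
    if n_missing >= min_missing_incomplete:
        return "incomplete"
    if n_mixed >= min_mixed_polyclonal:
        return "polyclonal"
    return "monoclonal"
```

Checking completeness before polyclonality mirrors the order in the text, where incomplete barcodes are removed before any further analysis.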

Trends in incident malaria cases 2011-2015
Between 2011 and 2015, 14,966 confirmed incident malaria cases were reported for all health facilities within Lusaka district, of which 8723 confirmed cases were reported from health facilities that were participating in this study at the time (10 HFs in 2011-2014, 27 HFs in 2014-2015). Among these confirmed incident malaria cases, the majority reported travel outside Lusaka in the previous 2 weeks (Fig. 3, Table 1), and the median age of incident malaria cases steadily increased from 8.9 years of age in 2011 (interquartile range = 2.7-26.7) to 16.1 years of age in 2015 (interquartile range = 6.1-28.2) (Fig. 4). Assuming that at least 40% of incident cases reporting travel outside Lusaka are imported cases, Lusaka has had an estimated R0 < 1 since 2011, with a decrease in 2014-2015 compared to 2011-2013 (Fig. 5).

Factors associated with finding additional positives during case investigations
From a total of 8723 confirmed incident malaria cases, 428 (4.9%) index malaria cases were investigated during RCD, enrolling 11,954 RCD participants (community members tested by the RCD system), and 206 RDT-positive malaria infections found (RCD incident cases). Test positivity during RCD was typically lower than 5% (mean 1.71% ± SD 1.65%), with higher test positivity during the high transmission season compared to the low transmission season (Fig. 6).
Among the RCD participants, a number of factors were associated with increased odds of individuals testing positive, including seasonality and distance to uninhabited areas, travel outside of Lusaka in the past month, age (children aged 5-15 years), and living in the same household as the index case (Table 2). Additionally, the proximity of the index case household to uninhabited areas of Lusaka during the high transmission season was associated with an increased probability of finding malaria infections during RCD (Table 3). However, no association between RDT-positive RCD participants and any index case demographics, including travel history (Table 3), was detected. In addition, no association was found between finding a positive during RCD and any of the remotely sensed environmental factors that were examined, specifically TPI at scales of 300 m and 2000 m, TWI, mean EVI, monthly EVI, minimum EVI, maximum EVI and altitude.

Genetic analysis
A subset of samples from 65 reactive case detection responses comprising 645 people (65 index cases and 580 household members) from 204 houses were further assessed by genetic analysis. The RDT positivity during the responses was 0.47% (3/580), while 446 individuals had DBS collected (59 index cases and 387 household members). PCR analysis identified two false positives (1 index case and 1 household member), and 4 false negative RDTs (all household members). The latter increased the positivity rate, to 1.55% (6/387), an approximate threefold increase.

Barcoding
Positive molecular barcode data was generated from 72 individuals, with a range of completeness. Of these, 22 samples had ≥ 11 missing loci (out of 24), were classified as incomplete and removed from any further analysis. The remaining 50 individuals had a median age of 17 years and had a high proportion of travel (67%) with a median travel time of 8.5 days. The proportion of polyclonal infections was moderate (28%) and genetic relatedness was high (72%) (Additional file 3: Table S1). Phylogenetic analysis did not show any evidence of clustering of genetic structure between individuals with or without travel history. Individuals with a travel history were slightly older (18 vs. 10 years old, p = 0.5), more likely to be an index case (97% vs. 87%, p < 0.05), have more polyclonal infections (38% vs. 13%, p = 0.2), more febrile (82% vs. 73%, p = 0.2), and have slightly less related infections (74% vs. 75%, p = 0.7) in comparison to individuals who did not travel (Additional file 3: Table S1, Figs. 7 and 8). Individuals with a travel history had a slightly higher genetic relatedness compared to the overall genetic relatedness of individuals without a travel history (Fig. 8).

COIL analysis
COIL analysis predicted that 54 of 71 (76%) genotyped malaria samples were from single infections, however 19 of these predicted single infections had posterior probabilities < 0.80 due to missing SNPs. Of the 50 genotyped samples with high statistical certainty, 32 were single infections (64%), 12 were dual infections (24%) and six had three or more infections (12%).

Discussion
In this study, incident case information, particularly spatial location, and later genetic analyses of malaria infections were used to review spatially the extent and location of malaria transmission in the city of Lusaka, Zambia, an area approaching malaria elimination. The rationale for such an approach was to enhance the signal-to-noise ratio by replacing a binary uninfected/infected output with a considerably richer haplotype/multiplicity of infection (MOI) output. Combined with the spatial data from reactive case detection, it was hoped that relationships between individual infections could be identified through haplotype matching/relatedness, and that estimates of transmission could be determined from the MOI. However, the study was not powered to measure a specific deviation, but rather aimed to describe the parasite population from a molecular point of view.

Transmission trends
While the total number of incident malaria cases was relatively constant, the standard measures of transmission available in the HMIS supported the conclusion that transmission decreased over the study period. These included an increase in the median age of incident cases, as well as an R0 that was < 1 and declined over time.
Interestingly, the probability of finding infections during RCD was increased if the index malaria case lived on the periphery of human settlement and the index case was reported during the wet season, but not the dry season. Given these results from RCD, it appears that there may be malaria transmission not associated with travel occurring along the periphery of human settlements during the wet season. A key challenge for RCD is determining the appropriate range for a response. In this study, living in the index house was associated with testing positive (adjusted IRR 4.03, Table 2); however, a large number of positives were identified in non-index houses. Without additional information on the relationship between, or source of, these different infections, it is unclear whether the radius used in this study, the nine closest neighbouring houses, is sufficient or excessive. Unfortunately, it was not possible to demonstrate any associations between remotely sensed topographical information and the probability of finding a malaria infection during RCD, other than distance from uninhabited areas. RDT-diagnosed malaria infections were used as the primary outcome in these analyses, which could have added noise through false positives and false negatives. While this noise is likely to be present, it is unlikely to entirely account for the null results observed. More research on urban malaria transmission dynamics is needed, particularly around risk mapping in urban environments. Distance to uninhabited areas is a known factor affecting the probability of being at risk of malaria transmission [54,55], but more research is needed to identify the specific characteristics of areas that make them more likely to sustain onward malaria transmission, so that they can be either modified or targeted for malaria control.
As Lusaka aims to become free of malaria transmission, increased mosquito control in the periphery may be of benefit, and linking malaria surveillance to vector control microplanning processes is important. The results found herein suggest that there is vectoral capacity in the periphery to facilitate malaria transmission if malaria parasites are present or imported. Malaria control programmes aiming for elimination may be more successful when focusing on locations of human settlements where sufficient habitat enables malaria transmission [56].
Although of benefit in describing malaria trends [57], human movement is more challenging to address as a driver of malaria transmission and in this study was not

Genetics
In contrast to the evidence derived from HMIS and RCD data, it was much harder to draw a clear conclusion from the spatial genetics data. Firstly, in this study, MOI (as determined by COIL) was higher than expected, even in individuals with no history of travel. While MOI is thought to correlate directly with transmission, where a large proportion of the infections are imported this relationship may be skewed. For example, if, as suggested here, local transmission represents a relatively small but persistent fraction of the source of infections, the high importation of diverse polymorphic infections likely sustains a high MOI for any locally transmitted cases. Where local transmission chains are very short, as supported by estimates of R reported here, this artificially high MOI would be more pronounced. Interestingly, the proportion of polygenomic infections correlated with travel (Fig. 7), suggesting that MOI decreases with local transmission. The ability to utilize SNP-barcode methods to successfully identify individual haplotypes decreases with increasing levels of MOI, making matching/determining relatedness between individual infections much harder. Indeed, it is possible that identical haplotypes were not identified as they were masked by other haplotypes in individuals with more than one parasite present. When designing the study, it was hoped that genetic relatedness would correlate across space at a local spatial scale (in terms of metres) for individuals without any recent travel history [58]. However, due to the high MOI and high importation, with the majority of infections likely acquired across a large area of Zambia, it was not possible to perform genetic spatial analyses with the SNP-barcode methods.
[Figure caption: Overall genetic relatedness of parasites found during RCD, comparing RCD participants with and without a travel history]
Future work aiming to examine spatial genetics as potential tools for assessing the epidemiology of malaria parasites should consider genetic barcoding methods such as amplicon deep sequencing which may allow analysis of polyclonal samples [59]. Otherwise, researchers should seek areas of low transmission where imported cases are relatively few. It is reasonable to expect the ability to differentiate imported from locally-acquired infections to increase in resolution as the repository of Zambian barcodes grows. Future analyses of these data when equipped with a better understanding of the identity and spatial distribution of Zambian parasite populations may yield clearer results.

Conclusions
Results suggest there may be two separate malaria transmission phenomena occurring simultaneously in Lusaka: low-level transmission circulating in the periphery as well as a high number of imported malaria cases. The vast majority (> 90%) of malaria cases are likely a result of travel outside Lusaka; however, there appears to be persistent malaria transmission, unrelated to travel, on the periphery of the city. Spatial analyses can be combined with genetic analyses to investigate infectious diseases, but may be limited in their findings due to the rarity of the infection and are further complicated by infections with a multiplicity greater than one. The macro-level tools of median age of malaria cases and Churcher's formula are useful in describing the former; however, they appear less useful in describing the latter. As Zambia continues its path towards malaria elimination, further fine-scale surveillance data must be collected to better understand urban and peri-urban transmission dynamics and to plan, coordinate and monitor malaria interventions.