Multilocus haplotypes reveal variable levels of diversity and population structure of Plasmodium falciparum in Papua New Guinea, a region of intense perennial transmission

Background: The South West Pacific nation of Papua New Guinea has intense year-round transmission of Plasmodium falciparum on the coast and in the low-lying inland areas. Local heterogeneity in the epidemiology of malaria suggests that parasites from multiple locations will need to be surveyed to define the population biology of P. falciparum in the region. This study describes the population genetics of P. falciparum in thirteen villages spread over four distinct catchment areas of Papua New Guinea.
Methods: Ten microsatellite loci were genotyped in 318 P. falciparum isolates from the parasite populations of two inland catchment areas, namely Wosera (number of villages (n) = 7) and Utu (n = 1), and two coastal catchments, Malala (n = 3) and Mugil (n = 3). Analysis of the resultant multilocus haplotypes was done at different spatial scales (2-336 km) to define the genetic diversity (allelic richness and expected heterozygosity), linkage disequilibrium and population structure throughout the study area.
Results: Although genetic diversity was high in all parasite populations, it was also variable, with a lower allelic richness and expected heterozygosity for inland populations compared to those from the more accessible coast. This variability was not correlated with two proxy measures of transmission intensity, the infection prevalence and the proportion of multiple infections. Random associations among the microsatellite loci were observed in all four catchments, showing that a substantial degree of out-crossing occurs in the region. Moderate to very high levels of population structure were found, but the amount of genetic differentiation (F_ST) did not correlate with geographic distance, suggesting that parasite populations are fragmented. Population structure was also identified between villages within the Malala area, with the haplotypes of one parasite population clustering with the neighbouring catchment of Mugil.
Conclusion: The observed population genetics of P. falciparum in this region is likely to be a consequence of the high transmission intensity combined with the isolation of human and vector populations, especially those located inland, and migration of parasites via human movement into coastal populations. The variable genetic diversity and population structure of P. falciparum has important implications for malaria control strategies and warrants further fine-scale sampling throughout Papua New Guinea.


Background
Malaria arising from infection with Plasmodium falciparum is a major cause of morbidity and mortality in tropical and sub-tropical regions of the world [1]. The difficulty in controlling this devastating disease has been due in part to high levels of genetic diversity of P. falciparum, allowing the rapid evolution and dissemination of advantageous traits such as drug resistance and antigenic variability. Malaria control would be more effective if the target parasite populations could be surveyed before an intervention to determine the extent of (i) genetic diversity, as a predictor of the populations' resilience to interventions; (ii) linkage disequilibrium, to understand the potential for multilocus haplotypes to spread through the region; and (iii) population structure, to map the distribution of diversity over geographic space and thus infer patterns of parasite migration. Population genetic surveys are therefore an essential preliminary step in designing the most appropriate and effective malaria control measures and as a baseline upon which to monitor their impact.
The worldwide population genetic structure of P. falciparum, as defined by multilocus genotyping, shows a general pattern of increasing genetic diversity, but decreasing linkage disequilibrium (LD) and population differentiation, in association with the parasite transmission intensity (Americas < Asia Pacific < Africa) [2,3]. This original concept is being constantly updated, with new studies being used to describe the various patterns found locally within each continent. In the Americas, parasite populations continue to be characterized by low diversity, high levels of LD and strong population structure independent of geographic distance [4]. In Asia, more extensive studies have demonstrated higher levels of diversity than previously recognized and a lack of LD [5]. Moderate to high levels of population structure among countries of mainland Asia [6], and among locations within Malaysia (Borneo) and the Philippine islands [7,8], have been reported. In Africa, population structure, low levels of genetic diversity and significant LD have been described in the urban populations of Senegal, Niger and the Republic of Djibouti [9]. Significant LD has also been found in regions of high transmission and diversity in Senegal and the Republic of Congo [10,11]. The variable results observed are likely the result of features inherent to each geographic region, such as the genetics and movement of human and anopheline hosts and biogeographical features that may interrupt gene flow, as well as the history of malaria transmission [2,12] and local malaria control efforts [7]. This emphasizes the importance of investigating the parasite population genetics within each region of interest, particularly now that malaria elimination is back on the agenda in many countries.
The epidemiology of malaria in the South West Pacific nation of Papua New Guinea is highly variable. Malaria transmission is concentrated in the coastal and lowland zones, where intense perennial transmission of the four major human malaria species (P. falciparum, Plasmodium vivax, Plasmodium malariae and Plasmodium ovale) occurs, whereas in the highlands Plasmodium spp. infection is present but is mostly due to sporadic epidemic transmission [13]. In the endemic regions, the degree of P. falciparum transmission is high but variable among different regions, villages and even clusters of houses within villages [14], with entomological inoculation rates ranging from 0.15 to 1.44 infective bites/person/night [15-18]. Accordingly, the prevalence of multiple infections also varies greatly within this region, ranging from 26 to 50% [2,19-21], thus creating a range of opportunities for recombination between different genomes and the further generation of genetic diversity. This diverse micro-epidemiology has been attributed to differing patterns of host behaviour [22], nutrition [23-26], mosquito control [14,27], the wide range of vectors present [16,27,28] and, more recently, bed net usage [20]. Variable patterns of parasite genetic diversity may result from, or underlie, this mosaic pattern of malaria epidemiology. In addition, the biogeography of the country, including mountains, thick forests and large rivers with limited transport, has resulted in the isolation of human populations, as evidenced by the existence of more than 800 local languages [29] and different frequencies of human genetic polymorphisms that protect against malaria in different provinces [30]. The human diversity is matched by a highly diverse vector fauna, with at least seven members of the Anopheles punctulatus complex and several minor species contributing to the transmission of malaria [13]. 
These vector species differ both in geographical distribution [27,31,32] and biting behaviour [33] and the major mosquito vector in the country, Anopheles farauti s.l. is known to fly only short distances (< 2 km). For these reasons, it is possible that gene flow among different parasite subpopulations of Papua New Guinea is limited. The presence of local parasite population structure has been suggested by differing seroprevalence to a P. falciparum antigen (S-antigen), even among closely spaced villages [34]. Consequently, it will be essential to sample more than one parasite population to explore the population structure of P. falciparum in Papua New Guinea. To date, there has only been one other study investigating the population biology of P. falciparum in Papua New Guinea by multilocus genotyping [2]. Using putatively neutral microsatellite markers, this study reported high levels of genetic diversity, a lack of LD and limited genetic differentiation between two neighbouring villages on the coast of Madang Province [2].
Given the locally variable epidemiology of malaria in Papua New Guinea [14], knowledge of the population genetics at different spatial scales is essential for decision-making for targeting P. falciparum populations for malaria control, managing the spread of drug resistance, developing approaches for vaccine design and eventually, elimination programs. The aim of this study was to describe the population genetics of P. falciparum sampled from thirteen villages spread over four distant catchments of East Sepik and Madang Provinces. Multilocus haplotypes were defined by genotyping a validated panel of microsatellite markers [35] and the extent and distribution of genetic diversity within and among parasite populations was measured. The results have important implications for programs targeted at controlling and eliminating this major human pathogen.

Methods
Study sites and P. falciparum isolates
East Sepik and Madang Provinces have long been the focus of malaria research and ongoing control efforts in Papua New Guinea. To represent a broad cross-section of the targeted parasite populations and to limit bias in the dataset, venous blood samples were collected from asymptomatic human volunteers of all ages in cross-sectional malaria surveys. In East Sepik Province, the survey took place in August and September of 2005 with 872 samples collected from individuals residing in one catchment area including seven villages spaced 2-10 km apart in the Wosera (Gwinyingi, Patigo, Nindigo, Kitikum, Wisokum (1 and 2) and Tatemba) (Figure 1). In Madang Province, the survey took place in March 2006 with 1,275 samples collected from individuals residing in three distinct catchment areas including ten villages within 5-20 km of Mugil (Dimer, Karkum and Matukar/Bunu), Malala (Amiten/Susure, Malala/Suraten and Wakorma) and Utu (Utu) health centres (Figure 1). As samples from three pairs of villages in Mugil and Malala were combined, and the Utu samples were collected during single surveys of three nearby villages, the samples represent the parasite populations of a total of thirteen villages or neighbouring village-pairs. For simplicity, "village" is used throughout the manuscript. Ethical approval to conduct the study was granted by the  Genomic DNA was extracted from whole blood samples using the 96 well QiaQuick DNA extraction kit (Qiagen). To identify P. falciparum positive samples and the number of infecting clones concurrently, the highly polymorphic antigen gene msp2 was genotyped as previously described [36]. This approach utilizes a nested multiplex PCR to simultaneously amplify msp2 from genomic DNA with different fluorescent primers specific for the 3D7 and FC27 allele families. Agarose gel electrophoresis identified samples with a positive PCR result and thus P. falciparum infection. 
Fluorescently labelled PCR products were then analysed with an ABI capillary electrophoresis platform with the internal size standard GS LIZ500 (Applied Biosystems). Resultant chromatograms were analysed with Peak Scanner V1.0 software (Applied Biosystems) to count the number of 3D7-and FC27-specific peaks (alleles) and thus estimate the total number of P. falciparum clones. From this we calculated two molecular epidemiological correlates of P. falciparum transmission intensity for each village: (i) the infection prevalence, calculated as the proportion (%) of samples with a positive PCR result and (ii) the proportion of infections with multiple clones, defined by the presence of more than one peak (allele) on the chromatogram. Only the P. falciparum isolates shown to contain single msp2 alleles were used for microsatellite genotyping.
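As a minimal sketch of how these two correlates follow from the msp2 peak counts, the Python below assumes a hypothetical list holding the number of msp2 peaks called for each sample from one village, with 0 standing for a PCR-negative sample. The function and variable names are illustrative and are not taken from the study.

```python
from typing import Dict, List


def transmission_correlates(peaks_per_sample: List[int]) -> Dict[str, float]:
    """Summarise msp2 genotyping for one village.

    peaks_per_sample[i] is the number of msp2 alleles (peaks) called in
    sample i; 0 means the sample was PCR-negative (hypothetical encoding).
    """
    n_samples = len(peaks_per_sample)
    positives = [k for k in peaks_per_sample if k > 0]
    multiples = [k for k in positives if k > 1]
    # (i) infection prevalence: % of all samples with a positive PCR result
    prevalence = 100.0 * len(positives) / n_samples if n_samples else 0.0
    # (ii) proportion of infections (positive samples) with more than one peak
    multiple = 100.0 * len(multiples) / len(positives) if positives else 0.0
    return {"prevalence_pct": prevalence, "multiple_pct": multiple}


# Made-up example: ten samples, six positive, two of them with >1 peak.
example = transmission_correlates([0, 1, 2, 0, 1, 3, 0, 0, 1, 1])
```

Note that the second quantity is expressed relative to infected samples only, following the definition in the text, so it is undefined (returned as 0.0 here by assumption) for a village with no positive samples.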

Microsatellite genotyping
Due to the low parasitaemia and limited quantity of the field samples, whole genome amplification of the selected P. falciparum isolates was performed using the Illustra Genomiphi V2 DNA amplification kit (GE Healthcare) according to the manufacturer's instructions. Each of these isolates was then genotyped using ten putatively neutral microsatellite markers developed by Anderson and colleagues [35] (TA1, TA60, Polyα, ARA2, Pfg377, TAA87, TAA42, PfPK2, TAA81 and 2490) with a reduced primer concentration of 0.08 mM. Fluorescently labelled PCR products were visualized with an ABI capillary electrophoresis platform and resultant chromatograms analysed using Peak Scanner V1.0 software (Applied Biosystems) to define alleles. Several isolates showed multiple peaks (alleles) with secondary peaks having a height greater than 30% that of the predominant peak, indicating the presence of multiple clones [35]. This was expected because some clones will share msp2 alleles and thus can only be distinguished by genotyping at additional loci. True "single infections" were defined as those containing only one allele for all microsatellite loci and those with two alleles at only one locus. The latter was a precaution against genotyping artefacts or within-clone variation that may result in two or more visible peaks on the chromatogram. "Multiple infections" were, therefore, defined as those in which at least two loci contained multiple alleles. Following the methodology of Anderson et al. [35], multiple infections with only two alleles were included in the dataset by reconstructing haplotypes from the predominant peaks for each locus. All isolates with more than two alleles at any locus were excluded. At least 75% of the isolates were genotyped successfully for each of the ten loci.
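The classification rules above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' pipeline: the data layout (a dictionary of loci, each with a list of (allele size, peak height) pairs), all names, and the choice to report the predominant peak for a single infection that shows two alleles at one locus are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Peak = Tuple[int, float]  # (allele size in bp, peak height); hypothetical layout

# Secondary peaks taller than 30% of the predominant peak count as alleles.
MINOR_PEAK_FRACTION = 0.30


def call_alleles(peaks: List[Peak]) -> List[int]:
    """Return the alleles called at one locus, predominant allele first."""
    if not peaks:
        return []  # locus failed to genotype
    ordered = sorted(peaks, key=lambda p: p[1], reverse=True)
    top_height = ordered[0][1]
    minor = [size for size, height in ordered[1:]
             if height > MINOR_PEAK_FRACTION * top_height]
    return [ordered[0][0]] + minor


def classify_isolate(
    loci: Dict[str, List[Peak]]
) -> Tuple[str, Dict[str, Optional[int]]]:
    """Label an isolate and build its haplotype from predominant peaks."""
    calls = {locus: call_alleles(peaks) for locus, peaks in loci.items()}
    # More than two alleles at any locus: isolate is excluded.
    if any(len(alleles) > 2 for alleles in calls.values()):
        return "excluded", {}
    n_multi_loci = sum(1 for alleles in calls.values() if len(alleles) > 1)
    # Two alleles at only one locus is still treated as a single infection.
    label = "single" if n_multi_loci <= 1 else "multiple"
    haplotype = {locus: (alleles[0] if alleles else None)
                 for locus, alleles in calls.items()}
    return label, haplotype


# Made-up isolate: one locus has a secondary peak above the 30% threshold.
label, haplotype = classify_isolate({
    "TA1": [(171, 1000.0)],
    "TA60": [(80, 900.0), (83, 400.0)],
    "ARA2": [(63, 500.0), (66, 100.0)],
})
```

Loci that failed to amplify are carried as `None` in this sketch so that downstream steps can decide how to treat missing data.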

Population genetic analysis
Allele frequencies for the 13 villages and overall were determined using CONVERT version 1.31 software, which was also used to generate input files for the other population genetic programs [37]. Genetic diversity was assessed using ARLEQUIN version 3.11 software [38] by determining the number of haplotypes (h), the number of alleles per locus (A) and the expected heterozygosity, calculated as

$$H_e = \frac{n}{n-1}\left[1 - \sum_{i} p_i^{2}\right]$$

where p_i is the frequency of the i-th allele and n is the number of alleles in the sample. Because A is strongly influenced by sample size, it is only reliable for large sample sizes (e.g. catchments); therefore, we also calculated the allelic richness (R_s), which is normalized on the basis of the smallest sample size, based on the rarefaction method developed by Hurlbert [39] and implemented in FSTAT version 2.9.3 software [40]. Associations between the latter two diversity indices and correlates of transmission intensity were measured by Spearman's rank correlation test using SPSS version 17. To measure multilocus LD (non-random associations among loci), the standardized index of association (I_A^S) was calculated using the program LIAN version 3.5 [41] for the whole dataset and for a curtailed dataset with haplotypes only from confirmed single infections, as a precaution against the bias that may result from the presence of any false dominant haplotypes [2]. As only complete haplotypes could be analysed by LIAN version 3.5, to maximize sample size, this analysis included only eight loci (TA1 and TAA42 were excluded). Due to the small size of the dataset within some villages, LD was calculated only on the scale of each catchment. Population differentiation was estimated by using two pairwise distance measurements: F_ST (θ), which estimates the weighted average F statistics over all loci based on the number of different alleles between haplotypes [42]; and R_ST, which calculates F statistics from the sum of the squared size difference (i.e. number of repeat units) between haplotypes [43], using only the seven microsatellite loci that follow the simple step-wise mutation model (TA87, ARAII, Pfg377, 2490, TA81, PfPK2 and TA60) [44].
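Returning to the diversity indices defined at the start of this section, the sketch below shows one way to compute them for a single locus. It assumes the standard unbiased estimator H_e = [n/(n-1)](1 - Σ p_i²) and Hurlbert's rarefaction formula for the expected number of alleles in a subsample of g haploid genotypes; the function names are hypothetical, and the code is not a reimplementation of ARLEQUIN or FSTAT.

```python
from collections import Counter
from math import comb
from typing import Hashable, Sequence


def expected_heterozygosity(alleles: Sequence[Hashable]) -> float:
    """Unbiased expected heterozygosity at one locus.

    `alleles` holds one allele call per haploid isolate (missing calls
    removed beforehand), so n is the number of alleles sampled.
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two allele calls")
    sum_sq = sum((count / n) ** 2 for count in Counter(alleles).values())
    return n / (n - 1) * (1.0 - sum_sq)


def allelic_richness(alleles: Sequence[Hashable], g: int) -> float:
    """Expected number of distinct alleles in a random subsample of size g.

    Rarefaction: each allele with count c contributes the probability that
    it appears at least once, 1 - C(n - c, g) / C(n, g).
    """
    n = len(alleles)
    if not 0 < g <= n:
        raise ValueError("g must satisfy 0 < g <= number of allele calls")
    total = comb(n, g)
    return sum(1.0 - comb(n - count, g) / total
               for count in Counter(alleles).values())


# Made-up locus with four isolates and three alleles.
calls = [100, 100, 103, 106]
he = expected_heterozygosity(calls)
rs = allelic_richness(calls, 2)
```

In practice g would be set to the smallest sample size among the populations being compared, which is why the rarefied value for the same population can change when the set of populations changes.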
Significance for both F_ST and R_ST was tested by comparison with 95% confidence intervals from 1023 permutations. As R_ST considers the distances between alleles, it is the more sensitive of the two statistics. Correlations between genetic differentiation and geographic distance (the shortest distance in km, as defined by the exact distance between geographic co-ordinates) were measured using the Mantel test [45] in FSTAT version 2.9.3 [40]. As small sample size may result in a biased estimate of genetic differentiation, the Mantel tests included only villages with n ≥ 22. To confirm the population structure identified by F statistics, Structure v. 2.3 software [46] was also used to test whether each haplotype clustered according to geographic origin. Structure assigns individual multilocus haplotypes probabilistically to one of a number of clusters (K) or jointly to multiple clusters (admixture) based on the allele frequencies at each locus [46,47]. The analysis was run 20 times for K = 1-20 for 10,000 Markov chain Monte Carlo (MCMC) iterations after a burn-in period of 10,000, using the admixture model and correlated allele frequencies. The most likely K was defined by calculating the rate of change of K, ΔK, according to the method of Evanno et al. [48], and geographic population structure was determined by assessing whether the ancestry coefficients were asymmetric among sampling locations [47]. To further visualize the complex relationships among haplotypes that might result from recombination, a weighted network approach that connects haplotypes if they shared at least three alleles was utilized. Network analysis was done using the free software Cytoscape [49]. Each node within the network represents an individual haplotype, and edges between nodes represent shared alleles between haplotypes. For visual clarity, a threshold was set such that nodes were only joined by edges if they shared at least three alleles. 
Modifications of this threshold value did not qualitatively change the structure of the network. Above this threshold, the edges in the network were weighted according to the number of shared alleles. Missing data points were assumed to be different between loci. An edge-weighted spring-embedded algorithm was used to construct the network. Based on Kamada and Kawai's notion of "force-directed" networks [50], the algorithm treats nodes as objects that repel each other dependent on a spring force between them, which is modified by the weight of the edge.
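The edge list for such a haplotype network can be sketched as below. This is a minimal illustration with hypothetical names: haplotypes are lists of allele sizes in a fixed locus order, missing calls are `None` and never count as shared, and the sharing threshold is exposed as the parameter `min_shared` (set to 3 here as an assumption). The spring-embedded layout itself is left to Cytoscape and is not reproduced.

```python
from itertools import combinations
from typing import Dict, List, Optional, Tuple

Haplotype = List[Optional[int]]  # allele size per locus; None = missing call


def shared_alleles(a: Haplotype, b: Haplotype) -> int:
    """Count loci at which two haplotypes carry the same allele.

    A missing call at either haplotype is treated as a mismatch.
    """
    return sum(1 for x, y in zip(a, b)
               if x is not None and y is not None and x == y)


def build_edges(
    haplotypes: Dict[str, Haplotype], min_shared: int = 3
) -> Dict[Tuple[str, str], int]:
    """Return weighted edges between haplotypes sharing enough alleles.

    The edge weight is the number of shared alleles; pairs below the
    threshold are left unconnected.
    """
    edges: Dict[Tuple[str, str], int] = {}
    for (id_a, hap_a), (id_b, hap_b) in combinations(haplotypes.items(), 2):
        weight = shared_alleles(hap_a, hap_b)
        if weight >= min_shared:
            edges[(id_a, id_b)] = weight
    return edges


# Made-up four-locus haplotypes for illustration only.
edges = build_edges({
    "A": [1, 2, 3, 4],
    "B": [1, 2, 3, 9],
    "C": [1, None, 7, 8],
    "D": [1, 2, None, 4],
})
```

The resulting dictionary can be written out as a three-column table (source, target, weight) and imported into a network tool; varying `min_shared` is a simple way to check that the overall structure does not depend on the threshold.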

Genetic diversity
There were as many haplotypes (h) as isolates successfully genotyped (n) in the dataset, showing that all haplotypes were unique (Table 1). The inland catchment of Utu had the lowest mean number of alleles (A), allelic richness (R_s) and expected heterozygosity (H_e). Because it had a larger sample size, Wosera had the highest A, but the normalized statistic R_s was similar to that of Mugil and Malala and it had the second lowest H_e after Utu. For the villages, Utu and several of the Wosera villages had the lowest values for all diversity parameters compared to the majority of coastal villages (Table 1). It should be noted that the different R_s values observed for Utu considered as either a catchment or village were the result of recalculation on the basis of the smallest sample size [39]. In contrast to the inland parasite populations, the coastal catchments of Mugil and Malala and the villages within them showed some of the highest values for all diversity parameters. In addition to the variable levels of diversity observed among catchments, R_s and H_e were highly variable within the catchments of Malala and the Wosera (Table 1). There was no significant correlation between genetic diversity (R_s and H_e) and the correlates of transmission intensity (Additional file 2). Less diversity was observed within villages compared to the catchments and also within catchments compared to the total (note that it was only possible to compare H_e between the two scales). In addition, allele frequencies varied among sites (Additional file 3). Higher levels of diversity among compared to within populations and differing allele frequencies between populations indicate the presence of population structure within the study area.

Multilocus linkage disequilibrium
Non-random associations among loci (multilocus LD) were measured for all complete haplotypes (n = 159) and also those from single infections (n = 111) by calculating the standardized index of association (I_A^S). The latter analysis was used to confirm LD in the absence of haplotypes predicted from multiple infections, which can result in higher estimates of recombination and thus bias against the detection of LD. To check whether associations may have arisen from clonal propagation, LD can be measured among unique haplotypes [51], but this was not necessary because all haplotypes in the dataset were unique (Table 1). Consistent with the high proportion of multiple infections in all populations (Additional file 1), no significant LD was identified within any of the catchments for either the full dataset or the single infections, but LD was significant when all catchments were combined (total; Table 2).

Population structure
The calculation of population pairwise differentiation using both F ST and R ST showed significant differentiation among the catchment areas ( Table 3). The strongest differentiation was observed between the inland and coastal catchments, and the weakest between the coastal catchments and between Malala and Wosera. Pairwise analysis of the differentiation between villages provided further insight into the structure among  The cluster analysis for the full dataset initially defined three clusters (i.e. the highest value of ΔK occurred at K = 3; Additional file 6). For this distribution, Utu and Wosera haplotypes were predominantly assigned to the same cluster (Figure 2A) possibly because of the weaker population structure between the two inland populations than that between inland and coastal populations (Table 3). Confirming this, separate analyses for the inland and coast haplotypes clearly defined two distinct populations for each dataset (ΔK peaked at K = 2; Additional file 6) with the majority of haplotypes from different catchments assigned to different clusters ( Figure  2B). Therefore, the distribution of all haplotypes among four clusters more appropriately summarizes the geographic population structure in Papua New Guinea (K = 4, Figure 2A). To further investigate the possibility of weak local population structure within catchments, we also reran the analysis separately for each catchment. This revealed two clusters for all catchments except Utu, which had three clusters (ΔK peaked at K = 2 for Wosera, Mugil and Malala but at K = 3 for Utu; Additional file 6). However, the clustering patterns were relatively symmetric among villages, except for the Malala catchment in which Amiten/Susure haplotypes were  assigned predominantly to only one cluster ( Figure 2C). For the larger datasets Amiten/Susure haplotypes were predominantly assigned to the same cluster as those from Mugil (Figures 2A and 2B). 
Therefore, the cluster analysis confirmed geographic population structure among catchments and within the Malala catchment.
The population structure detected within other catchments by this analysis was not a consequence of the spatial separation of parasite populations.
Confirming the above analyses of population structure between catchments, the network shows that the majority of connections were between haplotypes from the same catchment ( Figure 3). The Utu haplotypes formed a densely connected central cluster with few weakly linked nodes consistent with the lower diversity in the catchment, whereas the more diverse Wosera, Malala and Mugil haplotypes were more loosely connected to each other but formed separate lobes of the network radiating from the centre (Figure 3). The tightly connected Utu haplotypes can be explained by the presence of high frequency alleles for three loci (TAA109, TAA42, 2490; Additional file 3). The smaller peripheral network containing haplotypes from Wosera, Mugil and Malala indicates a panel of related haplotypes that shared fewer than three alleles with any of the haplotypes in the main network (Figure 3). Individual networks for each catchment consisted of a single lobe, with connections both within and among villages arguing against the presence of population structure (Additional file 7). For the Malala catchment though, the majority of Amiten/Susure haplotypes were more closely connected at the top of the network consistent with the F ST and cluster analyses.

Discussion
This is the most extensive study to date investigating the genetic structure of P. falciparum populations of Papua New Guinea. Included in the study were the parasite populations of thirteen villages (or village-pairs) distributed over two inland and two coastal catchment areas in the north of the country where malaria research and control efforts are focused. A previous analysis of the same set of microsatellite loci in two coastal villages (Buksak and Mebat) approximately 80 km apart in nearby areas of Madang Province reported a high degree of genetic diversity (H_e = 0.62-0.65), a lack of significant LD (I_A^S = 0.0055-0.0073; P > 0.05) and minimal differentiation between the two populations (F_ST = 0.015) [2]. We have similarly identified high levels of diversity and a lack of LD; however, by surveying many more villages over a larger area, we have discovered a wider range of diversity than previously shown (H_e = 0.64-0.77) and found that parasite populations are heterogeneous, with moderate to very high population structure detected throughout the study area (F_ST = 0.05-0.33; P < 0.01). Differences between the study of Anderson et al. [2] and the current findings are consistent with a variety of P. falciparum population structures throughout Papua New Guinea.
Levels of diversity are an indication of the fitness of the parasite population and thus how difficult it may be to target with drugs or vaccines. The diversity among catchments was high but also variable, with the inland populations of Wosera and Utu having lower levels of allelic richness and heterozygosity than the coastal populations of Malala and Mugil. The lack of association between the molecular epidemiological correlates of transmission and diversity in Papua New Guinea suggests that a number of factors influence the population genetics of P. falciparum in the region; these are discussed below. Given the potential difficulty in controlling diverse parasites, such knowledge has important practical implications for malaria control across the country.
As Papua New Guinean parasite populations had high levels of genetic diversity it was not surprising to find a lack of significant LD in all four catchments studied, while the LD found for the total dataset can be explained by the Wahlund effect due to the observed population structure [52]. Within each parasite population a large proportion of multiple infections was found and therefore cross-fertilisation and recombination between distinct parasite genomes would be expected to maintain random associations among loci. LD has important implications for the spread of multilocus drug resistance haplotypes, with high levels of inbreeding increasing their dispersal. In Papua New Guinea, the lack of LD combined with the geographic population structure would be unlikely to facilitate such events.
The population structure between the inland (Wosera and Utu) and coastal populations (Malala and Mugil) indicated the existence of barriers to gene flow and thus parasite migration, and other possible influences on population structure such as natural selection and genetic drift within each catchment. Although Utu is only ~50 km from the provincial capital of Madang town, it takes several hours to travel to this remote village, with only one road entering and leaving, and there is no direct route of travel (road or air) between Madang and the East Sepik Provinces. In contrast, the lower extent of population structure between Malala and Mugil, and the small amount of mixing between these two populations indicated in the cluster and network analyses, probably reflect their direct connection via the Pacific Highway. Despite this direct route of possible gene flow between Malala and Mugil, the population structure was significant. Migration of diverse parasites into these locations via human movement may partially explain this observation. For Malala, there is a boarding school with students attending from across Madang and East Sepik Provinces. Lower levels of differentiation among catchments occurred between Wosera and Malala, so it is plausible that some gene flow occurs between these locations via movement of students and their guardians. In Mugil, there is constant movement of people ferrying to and from the well-populated Karkar Island (17 km off the coast); however, the lack of samples from this location makes this speculation difficult to confirm. The genetic differentiation and cluster analyses also indicated that population structure occurs on a local scale (< 20 km). In particular, in the Malala area the village of Amiten/Susure, which is located ~6 km inland, was found to be genetically distinct from Malala/Suraten and Wakorma, which flank the school. 
In fact, these analyses suggested that the Amiten/Susure population was more similar to parasites from the Mugil villages, suggesting that they may represent the "true Madang coast" population. In Mugil and Wosera, some villages were also differentiated. Here, the low but significant F ST values can be explained by small sample sizes for all except for that between Matukar/Bunu and Karkum (Mugil), and Nindigo and Gwinyingi (Wosera). However, the cluster and network analyses did not suggest any geographic population structure within catchments other than Malala. An isolation-by-distance model did not explain the observed geographic population structure, suggesting that there is a non-continuous distribution of diversity in Papua New Guinea. This fragmented population structure may be explained by movement of the human host with a lack of transport between catchments combined with higher rates of migration into coastal catchments as described above.
There are other possible explanations for the patterns of population structure within Papua New Guinea. People from different catchments belong to different language groups [29], indicating historical separation of human populations and presumably the parasites infecting them. A different prevalence of genetic polymorphisms that protect against malaria between Madang and the East Sepik [30] might have also provided unique selective pressures for the respective parasite populations. In addition, a possible role for the anopheline vector in shaping the observed population genetic structure of P. falciparum in Papua New Guinea cannot be ignored. At least six distinct anopheline species transmit malaria in the region. In Madang Province, Anopheles farauti 4 is predominant in the inland villages such as Utu and Amiten/Susure, whereas A. farauti 1 is more common in villages that are proximal to the coast [31,32]. In the Wosera, Anopheles koliensis and Anopheles punctulatus are the predominant vectors [27]. The population structure observed was consistent with these vector distributions. In Africa, investigators have found no evidence of P. falciparum population structure between two co-existing vectors, Anopheles gambiae and Anopheles funestus [53], suggesting that transmission by these vector species is not a strong barrier to gene flow. In Papua New Guinea though, the vector species distribution is highly heterogeneous with a limited overlap [27,31,32], so the ability of the different species to transmit allopatric parasites would have to be tested. Other factors that influence transmission intensity, such as the use of bed nets [20], may also impact on the overall population structure. Whatever the explanation, the population structure of P. falciparum in Papua New Guinea is likely the result of a combination of factors, including the limited movement of both human and mosquito hosts, in addition to the greater accessibility of the coast in comparison to the inland populations. The public health implication of these findings is that parasite populations that might be assumed to be similar for development of malaria control strategies, such as vaccines, are in fact genetically distinct and thus may respond differently to such interventions. However, some populations may be easier to control if they are isolated from external sources of parasites. Utu, having the least diverse and most genetically differentiated parasite population, appears to be the most isolated catchment, and thus a location where malaria control strategies may be the most efficient.

Conclusions
A detailed understanding of the population genetics of P. falciparum can help guide malaria control efforts. Such knowledge is becoming paramount as the Papua New Guinean government prepares to intensify malaria control, not only for guiding these control efforts but also monitoring whether they are having an impact on parasite populations. This broad spatial survey of the population genetics of P. falciparum in Papua New Guinea has identified high but variable levels of genetic diversity, random associations among loci and population structure found at different spatial scales. The results have significant implications for malaria control in the Pacific region and show that countrywide population surveillance is needed throughout Papua New Guinea.