Sexual recombination is a signature of a persisting malaria epidemic in Peru

Background The aim of this study was to consider the impact that multi-clone, complex infections have on a parasite population structure in a low transmission setting. In general, complexity of infection (minimum number of clones within an infection) and the overall population level diversity is expected to be minimal in low transmission settings. Additionally, the parasite population structure is predicted to be clonal, rather than sexual due to infrequent parasite inoculation and lack of recombination between genetically distinct clones. However, in this low transmission of the Peruvian Amazon, complex infections are becoming more frequent, in spite of decreasing infection prevalence. In this study, it was hypothesized that sexual recombination between distinct clonal lineages of Plasmodium falciparum parasites were altering the subpopulation structure and effectively maintaining the population-level diversity. Methods Fourteen microsatellite markers were chosen to describe the genetic diversity in 313 naturally occurring P. falciparum infections from Peruvian Amazon. The population and subpopulation structure was characterized by measuring: clusteredness, expected heterozygosity (He), allelic richness, private allelic richness, and linkage disequilibrium. Next, microsatellite haplotypes and alleles were correlated with P. falciparum merozoite surface protein 1 Block 2 (Pfmsp1-B2) to examine the presence of recombinant microsatellite haplotypes. Results The parasite population structure consists of six genetically diverse subpopulations of clones, called "clusters". Clusters 1, 3, 4, and 6 have unique haplotypes that exceed 70% of the total number of clones within each cluster, while Clusters 2 and 5 have a lower proportion of unique haplotypes, but still exceed 46%. By measuring the He, allelic richness, and private allelic richness within each of the six subpopulations, relatively low levels of genetic diversity within each subpopulation (except Cluster 4) are observed. This indicated that the number of alleles, and not the combination of alleles, are limited. Next, the standard index of association (IAS) was measured, which revealed a significant decay in linkage disequilibrium (LD) associated with Cluster 6, which is indicative of independent assortment of alleles. This decay in LD is a signature of this subpopulation approaching linkage equilibrium by undergoing sexual recombination. To trace possible recombination events, the two most frequent microsatellite haplotypes observed over time (defined by either a K1 or Mad20) were selected as the progenitors and then potential recombinants were identified in within the natural population. Conclusions Contrary to conventional low transmission models, this study provides evidence of a parasite population structure that is superficially defined by a clonal backbone. Sexual recombination does occur and even arguably is responsible for maintaining the substructure of this population.


Background
The events that shape a local population structure of Plasmodium falciparum are not fully understood. There is a definite link between the parasite transmission rate and the amount of circulating genetic diversity within a local population; high transmission regions typically have larger pools of genetically diverse parasites, while low transmission regions typically have a limited pool of genetically diverse parasites. However, understanding how this diversity is maintained within a population is fairly unresolved due to the parasite's intricate life cycle, which has maintained both sexual and asexual phases. Unfortunately, the depth of the answer to this question is constrained by the genetic markers and molecular methods used unravel the evolutionary history by surveying the population diversity. Until deep sequencing becomes more feasible, microsatellite markers provide a relatively thorough guide to the local, and perhaps global, evolutionary history of P. falciparum [1], as well as other eukaryotic parasites [2][3][4][5][6][7][8][9][10][11][12]. Considering how genetic diversity is maintained and/or generated on a local level is a priority for advancing epidemiological applications on a molecular level by understanding how this parasite evolves in natural settings.
The contribution of sexual recombination between two unrelated P. falciparum clones on the population structure remains a topic for debate. Several studies have concluded that P. falciparum parasites are essentially clonal due to infrequent recombination between different clonal lineages, while others have concluded that sexual recombination occurs more frequently and contributes greatly to the overall population structure [13][14][15][16][17][18][19][20][21][22][23][24][25][26]. Unfortunately, measuring the contribution of population-level genetic diversity generated by sexual recombination is not straightforward, particularly in an organism like P. falciparum, which spends a great proportion of its life cycle in a haploid, asexual form. Several studies have investigated the contribution of genetic diversity though sexual recombination by measuring the non-random mating preference within the mosquito gut; however, data from these studies versus studies of recombination between clones from genetically distinct lineages, are incongruous [26][27][28][29].
Recombination between genetically distinct clones (or clonets) can only occur if a mosquito takes up more than one during a single blood meal or accumulates multiple clones during several interrupted feedings. The probability of a mosquito carrying multiple clones is then a function of the longevity and biting rate of the mosquito, mosquito uptake of gametocytes from different parasites clones (or clonets), and the amount of genetic diversity circulating within a parasite population [30][31][32][33][34]. In high transmission regions, the contribution of sexual recombination is likely to be higher and more apparent than in low transmission regions, due to increased opportunities for mosquitoes to accumulate different clones (or clonets) within the same blood meal [35][36][37][38][39][40][41]. Furthermore, regions of high transmission tend to have higher levels of genetic diversity with a notable decay in the linkage disequilibrium between loci, while in low transmission regions, the genetic diversity is markedly lower and disequilibrium between genetic markers is high [42]. However, even if sexual recombination between parasites of different lineages occurs at a low frequency, as is proposed in low transmission regions, the survival of the recombinants might impact the population structure and epidemiology of malaria. To test this, a longitudinal approach is necessary.
Since 2003, the Malaria Immunology and Genetics in the Amazon (MIGIA) project has been longitudinally following a low transmission (< 0.5 P. falciparum infections/person/year) cohort of approximately 2,000 participants located south of Iquitos, Peru in the community of Zungarococha. As this project progressed, snap-shot glances of the population-level diversity were made by using single copy, antigenic P. falciparum merozoite surface proteins (Pfmsp)-1, -3, and -6. Characterization with these markers showed that the overall populationlevel diversity was comparable to that of other low transmission regions [41,[43][44][45]. However, now with more than five years of longitudinal sampling data, it is clear that the population structure is far more complex than expected. A recent study, using 14 microsatellite markers scattered across 10 chromosomes, found 182 genetically distinct microsatellite haplotypes from 302 P. falciparum infections occurring between 2003 and 2007 [46]. Extensive population genetic analysis showed that the immigration of new clones into this low transmission population was highly improbable and could not explain the population-level diversity [46]. The investigators concluded that the level of parasite genomic diversity was effectively maintained on a temporal (2003)(2004)(2005)(2006)(2007) and spatial (four networked villages) level by the loss of once frequent clones and the introduction of newly evolved ones [46].
The objective of this study was to understand how this level of diversity has been maintained temporally within this low transmission population. Across five years of longitudinal sampling, there has been a significant increase in the proportion of complex infections to single-clone infections, despite decreasing P. falciparum infection prevalence over time. This increased proportion of complex infections is curious, but not unique; a similar finding was observed in Thailand [39]. Such a phenomenon has the potential to set up an ecological scenario, whereby the constituents of complex infections have been provided with an increased opportunity to recombine over time.
Consequently, it was hypothesized that sexual recombination between distinct clonal lineages of Plasmodium falciparum parasites were altering the subpopulation structure and effectively maintaining the populationlevel diversity in this local population. Sexual recombination between distinct clones (clonets) was traced in this population using 14 microsatellite markers scattered across 10 chromosomes and then correlated with an independent antigenic marker, Pfmsp1 Block 2 (Pfmsp1-B2). The main allelic families of Pfmsp1-B2 were found to correlate significantly with specific alleles from each of the microsatellite loci and were also found to have distinct microsatellite haplotype signatures. On a population-level, this resulted in the observation of a rigid clonal backbone that could be defined by Pfmsp1-B2 main allelic families. However, this clonal population structure breaks down due to increased sexual recombination between constituents of complex infections that have been propagated together over time. The consequences of such sexual recombination could impact parasite evolution of drug resistance, host's susceptibility to infection, and even the expansion of new clonal variants.

Study design
DNA samples were selected from the Malaria Immunology and Genetics in the Amazon longitudinal cohort study in the community of Zungarococha, located in the San Juan District south of Iquitos, Peru. Details on the study site and design are described in [47]. In brief, this longitudinal study employs both active and passive case detection methods. Of the community of approximately 2,100 individuals, 1,945 individuals participated each year. In active case detection (ACD) there are at least six household visits during the seven month malaria season: one at the beginning of the malaria season, followed by four consecutive weekly visits during malaria season, and then one at the end of the malaria season.
Infections were defined as one or more malaria-positive sample point(s) during the one-month of follow-up (termed "infection-month). The sampling could, therefore, begin before the initial malaria positive sample or after, with an overall sampling time frame of less than or equal to 30 days (4 weeks of sampling) [41,46,47]. Infections were defined as complex when more than one allele was present within a single sample or over the infectionmonth as determined by Pfmsp1-B2 genotyping. All blood samples, positive or negative by microscopy, underwent DNA extraction by Qiagen DNeasy Blood and Tissue Kits and were tested for presence of Plasmodium species using a nested multiplex PCR method targeting DNA encoding the small subunit ribosomal protein (ssrDNA) (Qiagen Inc., Valencia, CA) [48].
All passive case detection (PCD) samples collected from participating individuals were included in this study. Members of our field team, along with Ministerio de Salud (MINSA) personnel, staff the local health post, which services the community of Zungarococha. All individuals presenting with a fever are screened for malaria infection and treated if an infection is detected. These services are provided at no cost to the patient.
All protocols were reviewed and approved by the Institutional Review Boards at the University of Alabama at Birmingham (former institution), New York University School of Medicine, and MINSA. Written, informed consent/assent was obtained from all participants in this study.

Microsatellite marker selection and amplification
Fourteen microsatellite markers were chosen from the microsatellite linkage map for amplification. Descriptive haplotypes were assembled for clones by concatenating the alleles detected at each microsatellite locus. Infections were classified as complex when two or more loci contained multiple alleles. Haplotype phasing was confirmed by cloning by limiting dilution (see below). All other methods are described in Branch et al. [46]. Summary information on microsatellite marker loci, number of alleles, and the amount of diversity per locus can be found in Additional file 1.

Cloning by limiting dilution
Venous blood from P. falciparum positive individuals was collected in a BD Vacutainer ® (Becton-Dickinson (Becton, Dickinson and Co., Franklin Lakes, NJ, USA) and parasites were harvested and cryopreserved in a culture bank. A selection of 15 samples, determined to have two different alleles (30 different clones) based on Pfmsp1-B2 genotyping, were cultured in vitro. At 3% parasitaemia, small-volume aliquots were passed to new culture flasks with fresh culture media to separate and expand the clones. At each passing, including day 0, a small aliquot of the culture was taken for DNA extraction and amplification by PCR for Pfmsp1-B2 genotyping. Genotyping was performed at each passage until infections were passed as single clones over three passages. Additionally, microsatellite genotyping (using 14 microsatellites) was performed in triplicate on all samples before and after separating the clones by limiting dilution.

Pfmsp1-B2 genotyping analysis
PCR genotyping of Plasmodium falciparum merozoite surface protein 1 block 2 (Pfmsp1-B2) was attempted on all available samples that were microscopy and/or PCR positive for P. falciparum between 2003 and 2007 (858 P. falciparum infection-months) [40,41]. Of the 858 infectionmonths, 814 were reproducibly genotyped. Previously reported genotyping methods were used for main allelic family analysis, using all permutations of primers to detect any allele families that were one allele type in the 5' direction and a different allele type in the 3' direction [40,41,44,49]. Of the 194 complex infections, as defined by Pfmsp1-B2, there were only 46 (23.7%) instances when an infection was found to have more than one clone due to the detection of a different allele over time (≤ 30 days). These 46 infections were distributed over the five years of this study and did not bias the detection of complex infections.
Genetic diversity data analysis Cluster analysis Structure 2.2, a Bayesian programme for grouping genetically related isolates, was used to define the population structure [50]. Methods for this analysis were previously reported in Branch et al. [46]. In brief, individual microsatellite haplotype profiles were assessed by the clustering algorithm and each haplotype was then given an estimated membership coefficient (Q) that translates to the fraction of relatedness (or ancestry) within/between each cluster [51]. Q is used to establish the clusters (K), which are reported as an admixture proportion. Clusteredness was used to determine the most accurate number of clusters representative of this population. Clusteredness calculations measure the average relatedness of the individual membership coefficients (Q) and estimate the extent to which individual infections belong to a single cluster, rather than to a combination of clusters [51]. For each value of K (prespecified to equal between 1 and 10), the clustering algorithm was performed 5 times for 10,000 Monte Carlo Markov Chain iterations, followed by a burn-in period of 10,000 iterations. These results were then output and averaged to assess the clusteredness calculation for each value of K. Population differentiation was confirmed using F st and expected heterozygosity (H e ) (Arlequin v3.5.1.2) [52]. No population structure was observed for any K-value above K = 10.

Minimum spanning network
A minimum spanning network (MSN) was developed in Network 4.5.1.0 using the microsatellite haplotype profiles [53]. For the MSN, the sample identifications were coded with a combination of the cluster-population (Clusters 1-6) and infection-type (complex or singleclone) information associated with each sample, allowing us to (1) visualize the relatedness between parasite infections, (2) examine how closely related the clusters were to one another, and (3) determine the frequency of complex versus single-clone infections in each cluster subpopulation.

Linkage disequilibrium
The standard index of association (I A S ) was used to test for statistical independence of alleles at each of the 14 microsatellite loci for all subpopulations (LIAN v3.0) [54]. I A S uses a robust Monte Carlo simulation that accounts for differences between the observed (V D ) and expected (V e ) mismatch variances of alleles, provides a measurement for linkage and tests for significance. If V D and V e are significantly different, then independent assortment of alleles cannot explain the observed diversity and the alleles are in disequilibrium.

Allelic and private allelic richness
Subpopulation diversity was calculated by measuring the allelic richness (mean number of alleles per locus) and private allelic richness (number of alleles private to a specific population) using a rarefaction approach for both measurements to model the data based on subpopulation sample-size (ADZE v1.0) [55].

Ecology of complex infections Sample selection
Due to the single-copy nature of Pfmsp1-B2, this gene was used to generically discriminate between singleclone and complex infections for all isolates collected in this study between 2003 and 2007. In total, 858 infection-months underwent genotyping by Pfmsp1-B2; 814 of these were reproducibly genotyped and served as the baseline for genetic diversity (Table 1). Based on the proportion of different infection-types (single-clone versus complex infections) detected during each year of this study, a random sample subset was selected to undergo microsatellite genotyping. Of the 814 infectionmonths genotyped by Pfmsp1-B2, 243 were selected for microsatellite analysis (Table 2). Due to the infrequent detection of infections with more than two clones, infections with more than one clone were defined "complex", while infections with only one clone were defined as "single-clone".

Pfmsp1-B2 analysis
The proportion of complex versus single-clone infections for each year was calculated (excluding year-2003, due to collection beginning after the start of malaria season). There was a 2. This increased detection of complex infections over time is not due to sampling bias. The average number of complex infections detected through passive case detection (PCD) is significantly lower than those detected though active case detection (ACD) for each  (Figure 2). This is important and relevant because the increased proportion of complex to single-clone infections is attributable to complex infections containing both Mad20 and K1 clonal variants ( Figure 2).

Microsatellite analysis
Alleles determined by each of the 14 microsatellite loci were assembled into a single haplotype for each clone within an isolate. To confirm the accuracy of microsatellite haplotype phasing, 30 parasite clones were isolated from complex infections by in vitro culture, using cloning by limiting dilution methods (See methods). Microsatellite genotyping was performed before and after isolating the 30 clones. Twenty-three of the 30 clones were identical to the microsatellite haplotypes compiled without cloning out by limiting dilution. Of the seven discrepant microsatellite haplotypes, five had discrepancies in one locus (92.9% match with non-isolated haplotype) and two had discrepancies in two loci (85.7% match with non-isolated haplotype). However, the correct locus was always observed in at least one or two of the genotyping replicates, but never all three as was observed in the 23 clones.
In total, there were 313 clones detected in the 243 infection-months selected for microsatellite analysis; 62 of these infection-months were determined to be complex infections. These results were directly correlated to the Pfmsp1-B2 genotypic classification of infection-type and a kappa test was used to assess the strength of agreement between the proportions of complex versus single-clone infections defined by both genotyping methods.  The average COI (total number of clones divided by the total number of infection-months) differed slightly between the Pfmsp1-B2 (305 clones detected in 243 infections, COI = 1.26) and microsatellite genotyping (313 clones detected in 243 infections, COI = 1.29) methods (Table 2). Of the 243 infection-months selected, 54 of these were defined as complex by Pfmsp1-B2 genotyping (43 interfamily and 11 intrafamily), while 62 were defined as complex by microsatellite genotyping (14 markers scattered over 10 different chromosomes) ( Table 2). Detecting an increased number of complex infections using the microsatellite assay was anticipated due to the use of multiple loci. The kappa value was equal to 0.523 (95% CI = 0.447-0.598), indicating moderate strength of agreement between these genotyping methods.

Subpopulation structure
Microsatellite haplotype population structure analyses Structure v2.2 [50] was used to describe the subpopulation structure based on the 14 microsatellite loci haplotypes for all clones (N = 313) from all 243 infections in this sample-set. A post-hoc clusteredness calculation was performed to determine accuracy of predetermined number of clusters (K) [51]. The highest clusteredness value over 10 simulations showed K equal to 6, with a clusteredness value equal to 0.81. Arlequin v3.5.1.2 [52] was used to test the population differentiation (F st ) between these clusters. Each of the six clusters was found to represent independent subpopulations (p < 0.01, Pairwise F st ). On a spatial and temporal level, these six cluster subpopulations were maintained in all four-networked villages across each year of this study.   subpopulations. There was minimal interconnectedness between Clusters 1-5, while Cluster 6 was central in the MSN and had widespread connections within all Clusters, except Cluster 3. This indicates an increased relatedness between Cluster 6 and Clusters 1, 2, 4, and 5. Additionally, Cluster 6 showed an increased number of theoretical progenitors, compared with all other clusters ( Figure 3).

Recombination as a factory for diversity Underrepresented identical haplotypes
For each cluster subpopulation the number of haplotypes present, expected heterozygosity (H e ), allelic richness, and the private allelic richness were considered ( Interestingly, in spite of the high frequency of unique haplotypes within each cluster, there were relatively low levels of genetic diversity within each of these subpopulations. H e (Arlequin v3.5.1.2) was used to measure the genetic diversity based on the relative frequencies of the different alleles within each of the subpopulations. There was low H e in Clusters 1, 2, 3, 5, and 6, with a range of 0.15 to 0.40. To further investigate the subpopulation diversity, allelic richness (mean number of alleles per locus) and private allelic richness (number of alleles private to a specific population) were measured using a rarefaction approach to standardize and model the data based on subpopulation sample-size (ADZE v1.0). After correction for sample-size, Clusters 1, 2, 3, 5, and 6 averaged 4 alleles per locus, with approximately 23% of the alleles detected within each of these five subpopulations identified as private alleles. Cluster 4 deviates from the trend observed in Clusters 1, 2, 3, 5, and 6, with a moderately higher H e equal to 0.593 and an allelic richness equal to 4.71 alleles per locus with approximately 80% of the alleles detected identified as private alleles (Table 3).
In summary, identical haplotypes were underrepresented within this population even though the genetic diversity is quite limited on a subpopulation level. Such a dynamic is only possible if the number of alleles, and not the combination of alleles, is limited. This dynamic supports recombination as a potential contributor to the maintenance of genetic diversity within this population.  (Figure 4). The appearance of new haplotypes is not associated with the appearance of new alleles over time. The majority of the new alleles detected over time in this study were minor in frequency. Additional file 2 illustrates the frequency of each allele detected in this study over time.

Linkage disequilibrium
The standard index of association (I A S ) for each cluster was measured to test for statistical independence amongst alleles (independent assortment of alleles) at each of the 14 microsatellite loci (Table 3). Based on this method, Clusters 1, 2, 3, 4, and 5 were in linkage disequilibrium (LD) (p < 0.005, 10000-step Markov Chain analysis, one-tailed t-test, LIAN v3.0), while break down in LD was observed in Cluster 6 (p = 0.072, 10000-step Markov Chain analysis, one-tailed t-test, LIAN v3.0) ( Table 3). This decay in LD associated with Cluster 6 is indicative of an independent assortment of alleles, which is a signature of this subpopulation approaching linkage equilibrium by undergoing sexual recombination.

Breakdown in correlation between independent sets of genetic markers in the complex infection subpopulation
Next, Pfmsp1-B2 genotypes were correlated to the six cluster subpopulations as determined by using the 14 microsatellite loci. Using a probability model, the expected Pfmsp1-B2 genotype frequencies were compared to the observed genotype frequencies in all six clusters. With the exception of Cluster 6, one Pfmsp1-B2 main allelic family (K1 or Mad20), and often one specific Pfmsp1-B2 allele (repeat length), significantly correlated with the clones defining each cluster (p < 0.001, χ 2 ). Clusters 1 and 5 were significantly comprised of single-clone Mad20 alleles, while Clusters 2 and 4 were significantly comprised of single-clone K1 alleles. Cluster 6, the exception to this population structure, was significantly comprised interfamily complex infections (K1 plus Mad20), and also had the highest number of interfamily complex infections of all six subpopulations (p = 0.0044, Fisher's exact). The subpopulation structure of Cluster 6 may explain the previously observed decay in LD associated with this cluster, as this cluster may have evolved from recombination events between the two most frequently observed genotypes in this population.

Appearance of recombined haplotypes over time
To further investigate the evolution of Cluster 6, recombination between microsatellite alleles was investigated. Due to the significant correlation between the cluster subpopulations and the Pfmsp1-B2 genotypes, microsatellite haplotypes were identified by their Pfmsp1-B2 genotype: K1, Mad20, or K1 plus Mad20. In total, 57.1% (40 of 70) of the microsatellite alleles were significantly correlated to either the K1 or Mad20 allelic families ( Figure 5A). This enabled a straightforward association of K1 and Mad20 alleles to specific microsatellite alleles identified in the population. On the most basic level, this showed that K1 and Mad20 alleles might be in disequilibrium with specific microsatellite alleles, which has resulted in a rigid clonal backbone in this population. However, this correlation also enabled the meticulous tracing of sexual recombination across the 14 microsatellite loci. To trace possible recombination events, the two most frequent microsatellite haplotypes observed over time (defined by either a K1 or Mad20) were selected as the progenitors and then potential recombinants were identified in within the natural population. Six examples of potential recombinants that may have arisen from recombination between the Mad20-and K1-like dominant haplotypes are illustrated in Figure  5B. There is clear evidence for a breakdown in the correlation between these two independent genetic markers, likely due to recombination between clones alternatively identified by these two main allelic families.
The correlation between dominant Pfmsp1-B2 alleles and microsatellite markers was mapped to visualize the breakdown in the clonal backbone. A breakdown in this correlation would indicate homogenization between these genomes. For analytical ease, only seven of the most highly correlated microsatellite loci were selected for this study. Indeed, there was a discrete clonal backbone in the single-clone infections that was highly linked with the two Pfmsp1-B2 main allelic families, K1 and Mad20 ( Figure 6). However, in the interfamily complex infections the clonal backbone disappears as the correlation between individual microsatellite alleles and Pfmsp1-B2 main allelic families breaks down. Figure 6 clearly illustrates a population with complex infections that have undergone many recombination events, such that the overall population structure of these infections has become markedly more homogenous. Finally, this genome homogenization was examined over time. Populations were divided into "early years" (2003)(2004)(2005)

Discussion
In this study, the evolutionary impact of multiple clone, complex infections within a local and low transmission population in the Peruvian Amazon between 2003 and 2007 was investigated. Previously, researchers have found that in exceedingly low transmission regions, such as the Karen Refugee Camp in Thailand, there were more complex infections than expected by chance [39]. Though unexpected in low transmission settings, a niche with a high frequency of complex infections increases the potential for sexual recombination between genetically distinct clones within local populations. The formation, propagation, and ultimately the survival of complex infections in low transmission regions are of particular interest because these types of infections provide an increased opportunity for genetic diversity to be maintained through recombination. A study of this nature is not feasible in high transmission regions because the complexity of infection is generally high due to overlapping infections and virtually impossible to separate infections into individual clonal constituents. : : : Insignifnicant correlation of allele Pfmsp1-B2 allelic family : : b a b a a a a a a a a a a   1: b b a b a a a b a a a a a a   2: b b a b a b a b a a a a a a   3: b a a b a a a a a a a a a a   4: a a a a a a a a a a a a a a   5: a a b a a a a a a a a a a a   6: a a b a a b a a a a a a a a   K1-Like:  a a b a b b a b a a a a a  A. Pfmsp1-B2

C13M13
Allele signfinicantly correlates with Mad20 Allele signfinicantly correlates with K1   PF2802  TA42  TA109  TA1  BM17  B5M5  An earlier study, which examined the ecology of complex infections in this cohort from 2003-2004, found that complex infections within humans were not formed by an accumulation of clones by superinfection, but rather, complex infections were being propagated as a single unit during a blood meal [41]. In this study, Pfmsp1-B2 was genotyped from all P. falciparum positive samples collected between 2003 and 2007 to determine the frequency of complex infections over time. This finding was supported by the sampling and genotyping strategies used in this cohort study. All samples collected, regardless of a positive or negative diagnosis by microscopy undergo species-specific PCR. Due to the fact that many of infections in this community are asymptomatic, it is particularly important to consider the possibility of subpatent carriers. For this reason, sampling occurs over a one-month time-period and final genotypes for all infections are concatenated from multiple sample collections (4-6 collections over 4weeks), rather than a single time point; this time-point Figure 6 Model illustrating the clonal backbone of single-clone infections (A) and homogenization between constituents of complex infections (B). Each vertical column of circles represents a specific microsatellite locus (seven total for both A and B) and the number of independent circles within each vertical column represents the number of alleles detected within that particular locus. Blue and yellow coloring is indicative of a correlation with either Mad20 (blue) or K1 (yellow) Pfmsp1-Block2 main allelic families, respectively. The size of the colored circles represents the number of infections indentified by a particular allele that was associated with either Mad20 or K1. When circles are superimposed, this shows that alleles were correlated (with varying proportions based on circle-size) to both Mad20 and K1. Lines between loci indicate the number of instances (differentiated by the line thickness) that one allele paired with another allele in the population. A) Illustrates the clonal backbone present within single-clone infections, indicated by having a predominantly blue (Mad20) clonal family and a predominately yellow (K1) clonal family. This is further supported by a fewer number of lines between loci pairs and predominantly clonal haplotypes. B) Illustrates a breakdown in linkage between loci pairs, with an increased number of loci pairs and a relatively equal distribution of clones correlated to each Pfmsp1-B2 main allelic family causing the disappearance of the clonal backbone.
genotyping is referred to as infection-months in this manuscript.
When the longitudinal genotyping data was correlated to infection prevalence data between years 2003 and 2007, the frequency of complex infections were found to increase significantly from 16.3% in 2004 to 35.6% in 2007 in spite of a substantial decrease in the infection prevalence from 0.31 infections/person/month in 2004 to 0.12 infections/person/month in 2007, while the (p < 0.001, χ 2 ). This dynamic does not support an environment whereby complex infections were being generated by superinfection, but rather it provides additional evidence that the complex infections in this low transmission region were being propagated as single evolutionary-units.
The increase in complex infections relative to the single-clone infections was striking given the overall decrease in infection prevalence over the years of this study ( Figure 1). Two different scenarios can be speculated to explain this increase over time that is not related to the human host immune response. One possible explanation is that the increase in complex infections over time is an ecological phenomenon, rather than differential survival probabilities of complex versus single-clone infections. For example, it is possible that there is an ecological niche that maintains the transmission of complex infections over time, like mosquito breeding sites or potential reservoirs. However, the distribution of complex infections over time within each of the four networked-villages was proportionate, with no indication that one or more of the villages have a greater frequency of complex infections. Another possible explanation is that complex infections are more readily transmitted from the human host to the mosquito host [56][57][58][59][60][61][62][63][64][65][66][67][68]. This may be linked with an environmental stimulus that signals an increased production of gametocytes when parasites are under stress [59][60][61][62][63], but it is still unclear whether or not this may also be linked to disease outcome within the human host.
Regardless of why complex infections were increasing in the population, the result of such infections being propagated longitudinally would be a reduced chance of encountering the same genotype over successive infections due to increased opportunities for recombination [64]. The hypothesis in this study was that despite a superficial clonal population structure, sexual recombination between distinct clonal lineages of P. falciparum parasites is altering the sub-structure of this population. To directly test this hypothesis, previously defined criteria for distinguishing between clonal and sexual populations was used [25].
Using microsatellites to longitudinally characterize the population and within-host level genetic diversity, it is clear that the parasite population can be divided into six distinct cluster subpopulations that were maintained over time in this study. The frequency of identical and unique microsatellite haplotypes were then investigated within the total population and within each of the six cluster subpopulations. Of the 313 clones in the total population, there were 197 unique haplotypes observed (Table 3). These unique haplotypes were not isolated to a particular subpopulation, but rather were found across each of the six subpopulations (clusters) of related clones. Clusters 1, 3, 4, and 6 have unique haplotypes that exceed 70% of the total number of clones within each cluster, while Clusters 2 and 5 have a lower proportion of unique haplotypes, but still exceed 46% (Table 3). This contradicts what is expected in the null hypothesis.
The excess of unique haplotypes was further investigated by measuring the H e , allelic richness, and private allelic richness within each of the six cluster subpopulations (Table 3). These analyses revealed that the number of alleles is limited, while the combination of alleles (which comprise the haplotypes), is not limited. This indicated that a breakdown in linkage between the loci could help to explain the excess of unique haplotypes. Next, a test for independent assortment of alleles was performed using the standard index of association (I A S ). This test indicated a significant decay in LD restricted to Cluster 6 ( Table 3). On a parasite level, predicting the genetic exchange during one round of sexual recombination is at best an educated guess, but over ten years of persisting recombination, it is virtually impossible to do, even with sophisticated models. Due to the significant decay in LD associated with Cluster 6, this cluster became a model for recombination in this study; however, Cluster 6 was not the only cluster where recombinants were identified.
To further consider the breakdown of LD in Cluster 6, the correlation between the microsatellite haplotypes and the single-copy antigenic locus Pfmsp1-B2 was examined. Using Pfmsp1-B2 as a generic and independent genetic marker permitted a broad characterization of the genome and further insight into the ecology of the infections analyzed. Five of these subpopulations (Cluster 1-5) were significantly comprised of singleclone infections that significantly correlated with specific Pfmsp1-B2 main allelic family genotypes (either K1 or Mad20), while Cluster 6 was significantly comprised of complex infections that were a mixture of Pfmsp1-B2 main allelic family genotypes (K1 and Mad20). A breakdown in linkage could be readily observed by considering the linkage in the single-clone (K1 or Mad20) versus complex clone infections (K1 and Mad20) ( Figure 5; Figure 6). The population structure of the single-clone infections define a clonal backbone with strong linkage to the antigenic individual Pfmsp1-B2 genotypes, while the complex infections undergo apparent recombination, which breaks down the linkage between alleles in this backbone. The result of this is an increased number of unique haplotypes over time ( Figure 4). Further, it was possible to single-out potential progenitors of recombined clones, which strengthens the evidence of sexual recombination contributing to the parasite genetic diversity within this population ( Figure 5).
Repeated or recursive recombination naturally results in the homogenization of different parasite clones or strains, unless there is some barrier to recombination that maintains a clonal backbone structure. Although microsatellite markers are considered neutral, there might be linkage with antigens scattered across the genomes. If this is the case, the recombination may breakdown linkage between antigen encoding genes, which is particularly useful if the human host is immune to reinfection by recently encountered genotypes [34,[64][65][66][67][68][69][70][71][72]. Others have suggested that recombination is favorable in changing environments, like parasite-host infections that are often modulated by the immune response [34,72]. However, the observation of recombined microsatellite haplotypes in this study could also be due to one particular locus under selection. That is, it is possible that a particular mutation/allele or combination of alleles, which may have been fixed in the population during the propagation of complex infections over time, resulting in a population with a slight selective advantage [65][66][67][68]73].
In the most practical context, this study shows that genetic diversity can increase even when there is low transmission. This scenario is likely to be reencountered in the future as the current P. falciparum elimination efforts are effectively reducing clinical malaria in many countries across the world. The expansion of recombined clones within a local population could increase host susceptibility to infection, and also provide more opportunities for the evolution of drug resistance.

Conclusions
The aim of this study was to consider the impact that multi-clone, complex infections have on a parasite population structure in a low transmission setting. Contrary to conventional low transmission models, this study provides evidence of a parasite population structure that is superficially defined by a clonal backbone. Sexual recombination does occur and even arguably is responsible for maintaining the substructure of this population. The maintenance of genetic diversity in this population is likely a consequence of the long-term propagation of complex infections in this region. The fact that these recombinants were surviving in this population, despite a decreasing prevalence rate, raises an important question on the fitness of these parasites and/ or the accumulation of clones within one infection.

Additional material
Additional file 1: Microsatellite loci, alleles, and diversity. Description of the microsatellite loci used, the number of alleles detected, and the expected heterozygosity of each locus within all samples used in this study.
Additional file 2: Longitudinal frequency of alleles detected in study. Illustrated here are the frequencies of each allele, within each locus, over the five years of this study. Different shades of gray indicate the proportion of a detected allele with respect to the locus and the year of collection.