Microsatellite genotyping and genome-wide single nucleotide polymorphism-based indices of Plasmodium falciparum diversity within clinical infections
- Lee Murray†1,
- Victor A. Mobegi†1, 2,
- Craig W. Duffy1,
- Samuel A. Assefa1,
- Dominic P. Kwiatkowski3,
- Eugene Laman4,
- Kovana M. Loua4 and
- David J. Conway1
© The Author(s). 2016
- Received: 5 March 2016
- Accepted: 3 May 2016
- Published: 12 May 2016
In regions where malaria is endemic, individuals are often infected with multiple distinct parasite genotypes, a situation that may impact on evolution of parasite virulence and drug resistance. Most approaches to studying genotypic diversity have involved analysis of a modest number of polymorphic loci, although whole genome sequencing enables a broader characterisation of samples.
PCR-based microsatellite typing of a panel of ten loci was performed on Plasmodium falciparum in 95 clinical isolates from a highly endemic area in the Republic of Guinea, to characterize within-isolate genetic diversity. Separately, single nucleotide polymorphism (SNP) data from genome-wide short-read sequences of the same samples were used to derive within-isolate fixation indices (F ws), an inverse measure of diversity within each isolate compared to overall local genetic diversity. The latter indices were compared with the microsatellite results, and also with indices derived by randomly sampling modest numbers of SNPs.
As expected, the number of microsatellite loci with more than one allele in each isolate was highly significantly inversely correlated with the genome-wide F ws fixation index (r = −0.88, P < 0.001). However, the microsatellite analysis revealed that most isolates contained mixed genotypes, even those that had no detectable genome sequence heterogeneity. Random sampling of different numbers of SNPs showed that an F ws index derived from ten or more SNPs with minor allele frequencies of >10 % had high correlation (r > 0.90) with the index derived using all SNPs.
Different types of data give highly correlated indices of within-infection diversity, although PCR-based analysis detects low-level minority genotypes not apparent in bulk sequence analysis. When whole-genome data are not obtainable, quantitative assay of ten or more SNPs can yield a reasonably accurate estimate of the within-infection fixation index (F ws).
- Microsatellite Locus
- Minor Allele Frequency
- Microsatellite Genotyping
- Single Nucleotide Polymorphism Analysis
Malaria parasite blood-stage infections commonly contain a mixture of different haploid parasite genotypes, particularly in areas of high endemicity where superinfection frequently occurs . Cross-mating and recombination between different genomes of parasites occurring together in a vector mosquito blood meal is, therefore, most frequent in highly endemic areas, whereas in areas of low endemicity inbreeding may be common as most infections contain single or highly related genotypes [1–6]. In an experimental model of malaria using Plasmodium chabaudi in mice, multiple genotype infections have been associated with apparent short-term evolution of virulence , alterations to parasite sex ratio and production of gametocytes [8, 9], and effects on the immune clearance rate . If such processes occur in human malaria, they might also impact on drug resistance evolution [3, 11].
Previous analyses of within-host diversity of P. falciparum, the causative agent of most human malaria cases globally , have typically involved genotyping a small number of highly polymorphic gene loci , or multiple putatively neutral microsatellite marker loci . These have demonstrated wide variation in the genotypic complexity of infections among geographical populations of P. falciparum, which inversely correlates with local levels of multilocus linkage disequilibrium [1, 14–17]. Further dissection of the within-host diversity of P. falciparum infections has been performed using genome-wide single nucleotide polymorphism (SNP) data, showing that a high degree of relatedness is seen among some distinct parasite clones within infections, in comparison to those sampled from separate infections [4, 18]. This illustrates that a multiple genotype P. falciparum infection may comprise a mixture of closely related, non-identical parasites, or multiple genetically unrelated parasites, or it may be a complex mixture of both.
The relative proportions of SNP alleles in whole-genome sequence data from an infection can be estimated from the proportions of reads mapping to a reference sequence, and this allows computation of a within-isolate fixation index, F ws [19, 20]. This index compares within-host diversity (‘w’) to that which exists in the overall local parasite population (or sub-population, ‘s’). It has a possible range from zero (when the sample from an infection contains all possible diversity) through to 1.0 (when the sample from an infection contains no sequence diversity), and this is profoundly influenced by the relative proportions of genotypes in the case of a mixed infection.
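As a rough illustration of why the relative proportions of genotypes matter so much, consider an idealised infection containing two unrelated haploid genotypes at proportions a and 1 − a, each drawn at random from the local population. This two-genotype model is an assumption made here for illustration and is not an analysis from this study. Taking heterozygosity as 2pq, the two genotypes differ at a given SNP with probability 2 p_s q_s, and where they differ the within-infection heterozygosity is 2a(1 − a), so that:

```latex
\begin{align}
F_{ws} &= 1 - \frac{H_w}{H_s}, \\
\mathbb{E}[H_w] &= \underbrace{2\,p_s q_s}_{H_s}\,\cdot\, 2a(1-a)
  \;\Longrightarrow\; F_{ws} \approx 1 - 2a(1-a), \\
a = 0.5 &:\; F_{ws} \approx 0.50, \qquad
a = 0.05 :\; F_{ws} \approx 0.905 .
\end{align}
```

Under these assumptions a genotype present at 5 % shifts the index far less than a balanced mixture does, even though both infections contain two genotypes.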
Here, the within-host diversity of P. falciparum within a highly endemic population in West Africa has been characterized using two distinct methods. Firstly, microsatellite PCR-based genotyping was performed with a panel of ten loci widely distributed in the parasite genome, and then whole-genome sequence data from the same samples were used to analyse SNPs in order to compute the F ws indices. The relationship between these different types of estimates is examined, and the use of small numbers of SNPs to derive F ws indices is also illustrated, so that the potential value of SNP genotyping may be considered when whole-genome data are not obtainable.
Plasmodium falciparum sample collection and preparation
Patients aged between 1 and 9 years presenting with uncomplicated P. falciparum malaria, as confirmed by rapid diagnostic test (Paracheck, Orchid Biomedical Systems, India), were recruited from local health facilities within a 25-km radius of the town of N’Zerekore in the Republic of Guinea, between March and May 2011. Written informed consent was obtained from a parent or guardian of each child sampled, and all patients diagnosed with malaria were treated with artesunate-amodiaquine regardless of study participation. Ethical approval for the sample collection for parasite sequencing and genotypic analysis was obtained from the Comité d’Ethique National pour la Recherche en Santé, République de Guinée (National Ethics Committee for Health Research, Republic of Guinea). Following detection of P. falciparum infection, up to 5 mL of venous blood was collected in EDTA (ethylenediaminetetraacetic acid) vacutainers from consenting subjects. Leukocyte depletion of each sample was performed using CF11 cellulose column filtration , and the erythrocytes were then frozen at −20 °C prior to shipment on dry ice to The Gambia. DNA extraction was carried out at the MRC Laboratories in The Gambia, using a QIAamp Blood Midi Kit (Qiagen) and DNA was quantified using a NanoDrop ND-1000 v3.3 spectrophotometer (NanoDrop Technologies, USA). Separate aliquots of DNA from each individual sample were then used for microsatellite genotyping and whole genome sequencing. The Illumina paired-end, short-read genome sequencing of these samples has been previously described .
Microsatellite genotyping and analysis of Plasmodium falciparum clinical isolates
Microsatellite genotyping was performed on 95 isolates, out of the 100 for which genome sequence data were separately published . Ten polymorphic microsatellite loci were genotyped using previously described heminested PCR methods , with dye-labelled internal primers as specified for each locus in a recent analysis of other samples from West Africa . The PCR products were run electrophoretically on an ABI 3130XL Genetic Analyzer alongside a GeneScan™ 500 LIZ internal size control standard. For each microsatellite locus at which multiple product sizes were detected within an isolate, the majority and minor alleles were recorded . The number of different genotypes detectable within an isolate was counted as the maximum number of alleles found at any individual microsatellite locus. The proportion of loci that had more than one allele detectable within each isolate was also calculated as a separate measure of mixedness. For analyses of population-wide frequencies the majority allele at each microsatellite locus within each isolate was counted.
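The per-isolate summaries described above can be sketched in Python as follows. This is a minimal illustration, not the authors' procedure: the function name, the input layout (allele sizes listed per locus with the majority allele first) and the locus names and sizes in the example are all hypothetical.

```python
from typing import Dict, List


def summarise_isolate(alleles_by_locus: Dict[str, List[int]]) -> dict:
    """Summarise microsatellite calls for one isolate.

    alleles_by_locus maps a locus name to the allele sizes detected at that
    locus, majority allele first. Loci with no call (empty list) are ignored.
    """
    typed = {locus: calls for locus, calls in alleles_by_locus.items() if calls}
    if not typed:
        raise ValueError("no loci were successfully typed for this isolate")

    # Number of genotypes = maximum number of distinct alleles at any one locus.
    n_genotypes = max(len(set(calls)) for calls in typed.values())
    # Loci showing more than one allele, as a count and as a proportion.
    n_mixed = sum(1 for calls in typed.values() if len(set(calls)) > 1)

    return {
        "min_genotypes": n_genotypes,
        "mixed_loci": n_mixed,
        "prop_mixed_loci": n_mixed / len(typed),
        # Majority allele per locus, as used for population-wide frequencies.
        "majority_alleles": {locus: calls[0] for locus, calls in typed.items()},
    }


# Hypothetical isolate: placeholder locus names and allele sizes (bp).
example = {
    "locus_A": [172],
    "locus_B": [150, 156],
    "locus_C": [201, 204, 210],
    "locus_D": [],  # failed amplification, excluded from the denominators
}
summary = summarise_isolate(example)
```

Counting the maximum number of alleles at any locus gives a lower bound on the number of genotypes present, since distinct genotypes can share alleles at every typed locus.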
Genome-wide calculation of the within-isolate fixation index F ws
The whole genome Illumina paired-end, short-read sequence data from the P. falciparum clinical isolates genotyped here are published and available at the European Nucleotide Archive as previously described . Quantification of within-host parasite diversity within each individual isolate relative to the overall local population diversity was performed using the F ws metric [19, 20]. For this process, using a set of 50,082 biallelic SNPs, allele frequencies were calculated at every SNP position for each isolate individually, with p and q representing the proportions of read counts for the minor and major alleles. All SNPs were then assigned to ten minor allele frequency (MAF) intervals representing the proportional frequency of the minor allele at each SNP across the Guinean population, with the ten equally sized intervals ranging from 0–5 % up to 45–50 %. Levels of within-host (H w) and local parasite sub-population (H s) heterozygosity for each SNP were calculated as H w = 2 × p w × q w and H s = 2 × p s × q s. The mean H w and H s of each MAF interval were then computed from the corresponding heterozygosity scores of all SNPs within that particular interval. A plot of mean H w against mean H s was then produced for each isolate, and a linear regression model was used to determine the gradient H w/H s, with F ws = 1 − (H w/H s). All F ws analyses were performed using custom scripts in R.
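The calculation described above can be sketched in Python for a single isolate. This is not the authors' R code, which is not reproduced in the article. The function name and argument layout are hypothetical, and fitting the regression through the origin is an assumption, since the text does not state whether an intercept was included.

```python
import numpy as np


def estimate_fws(ref_reads, alt_reads, pop_alt_freq, n_bins=10):
    """Estimate F_ws for one isolate from per-SNP read counts.

    ref_reads, alt_reads: read counts for the two alleles at each SNP.
    pop_alt_freq: population frequency of the alternate allele at each SNP.
    """
    ref_reads = np.asarray(ref_reads, dtype=float)
    alt_reads = np.asarray(alt_reads, dtype=float)
    pop_alt_freq = np.asarray(pop_alt_freq, dtype=float)

    # Within-isolate heterozygosity H_w = 2pq at SNPs with any coverage.
    depth = ref_reads + alt_reads
    covered = depth > 0
    p_w = alt_reads[covered] / depth[covered]
    h_w = 2.0 * p_w * (1.0 - p_w)

    # Population heterozygosity H_s = 2pq and minor allele frequency.
    p_s = pop_alt_freq[covered]
    h_s = 2.0 * p_s * (1.0 - p_s)
    maf = np.minimum(p_s, 1.0 - p_s)

    # Assign SNPs to equal-width MAF intervals: 0-5 %, ..., 45-50 %.
    edges = np.linspace(0.0, 0.5, n_bins + 1)
    bin_index = np.digitize(maf, edges[1:-1])

    # Mean H_w and H_s within each non-empty interval.
    mean_hw, mean_hs = [], []
    for b in range(n_bins):
        in_bin = bin_index == b
        if in_bin.any():
            mean_hw.append(h_w[in_bin].mean())
            mean_hs.append(h_s[in_bin].mean())
    mean_hw = np.array(mean_hw)
    mean_hs = np.array(mean_hs)

    # Least-squares gradient of H_w on H_s, forced through the origin
    # (an assumption made for this sketch).
    denom = mean_hs @ mean_hs
    if denom == 0:
        raise ValueError("no polymorphic SNPs in the population sample")
    gradient = (mean_hs @ mean_hw) / denom
    return 1.0 - gradient
```

An isolate whose reads carry only one allele at every SNP has H w = 0 throughout and so returns 1.0, while an isolate whose within-sample allele proportions match the population frequencies returns a value close to zero.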
Estimation of the F ws index from limited numbers of SNPs
F ws indices were then derived from sampling a small number (between one and 20) of randomly selected SNPs and compared with the genome-wide indices previously determined. A Pearson’s correlation coefficient with the genome-wide SNP estimate of F ws was determined across all isolates. For each limited SNP selection of n SNPs, 100 random samples of n SNPs were analysed and the mean Pearson’s coefficient observed for these was calculated. This analysis was repeated and comparisons performed between sets of SNPs with different ranges of overall minor allele frequencies.
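The resampling procedure can be sketched as below. This is an illustrative sketch rather than the authors' scripts. The function names are hypothetical, and for simplicity the index is computed from a per-SNP regression through the origin without MAF binning, which is an assumption and may differ from how the small-sample indices were derived in the study.

```python
import numpy as np


def fws_per_isolate(p_w, p_s):
    """F_ws for each isolate from per-SNP allele proportions.

    p_w: array (isolates, snps) of within-isolate allele proportions.
    p_s: array (snps,) of population allele frequencies.
    """
    h_w = 2.0 * p_w * (1.0 - p_w)
    h_s = 2.0 * p_s * (1.0 - p_s)
    # Gradient of H_w on H_s through the origin, one value per isolate.
    return 1.0 - (h_w @ h_s) / (h_s @ h_s)


def mean_subsample_r(p_w, p_s, n_snps, min_maf=0.0, n_draws=100, seed=0):
    """Mean Pearson r between full-data F_ws and F_ws from n_snps random SNPs."""
    rng = np.random.default_rng(seed)
    full = fws_per_isolate(p_w, p_s)

    # Restrict sampling to SNPs whose population MAF exceeds the threshold.
    maf = np.minimum(p_s, 1.0 - p_s)
    eligible = np.flatnonzero(maf > min_maf)

    r_values = []
    for _ in range(n_draws):
        cols = rng.choice(eligible, size=n_snps, replace=False)
        sub = fws_per_isolate(p_w[:, cols], p_s[cols])
        # Pearson r is undefined when every isolate gets the same value.
        if np.std(sub) > 0:
            r_values.append(np.corrcoef(full, sub)[0, 1])
    return float(np.mean(r_values)) if r_values else float("nan")
```

Calling `mean_subsample_r(p_w, p_s, n_snps=10, min_maf=0.10)` would correspond to sampling ten SNPs with minor allele frequencies above 10 %, and repeating the call over `n_snps` from 1 to 20 traces how the correlation changes with the number of SNPs.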
Mixed genotype infections assessed by multi-locus microsatellite analysis
Genotypic mixedness estimated from whole-genome SNP data compared with microsatellite data
Genome-wide SNP data for each of the isolates were used to generate within-host F ws fixation indices, for comparison to the microsatellite profiling of the same isolates. Using the Illumina short-read sequences to estimate relative frequencies of SNP alleles within each isolate, the within-isolate diversity (H w) was derived, and the gradient of H w/H s (where H s is the diversity in the entire local population sample) was calculated to derive the F ws index for each isolate (with a range from zero indicating maximal possible diversity, to 1.0 indicating no observed diversity within an isolate). Over all 95 clinical isolates, the mean genome-wide F ws index was 0.79, with a range of 0.16 to 1.00 (Additional file 1). Of these, 50 isolates (52.6 %) had an F ws index approaching 1 (values of >0.95) indicating that they each contain a single predominant haploid genome sequence, with any additional genotypes being rare or absent, or closely related to the predominant genotype. Conversely, 45 of the clinical isolates (47.4 %) had an F ws index of <0.95 and were thus clearly genotypically mixed infections. No significant correlation was seen between average genome sequence read mapping depth and F ws index (Pearson’s r = 0.02), indicating that there was no bias in estimating within-host diversity due to sequence coverage.
Identifying SNPs for assay of within-host diversity with a small number of loci
A combination of microsatellite locus typing and genome-wide estimation of SNP-based allele frequencies has been used here to characterize P. falciparum diversity within clinical infections. The results are indicative of high transmission intensity in the Guinean population studied, and are consistent with previous microsatellite data from other samples taken locally . Interestingly, the genome-wide SNP data here indicate that more than half of all infections are each composed predominantly of a single genotype, whereas microsatellite genotyping detected additional genotypes within infections. Microsatellite typing allows the sensitive detection of distinct parasite genotypes present at low proportions within an infection, although cloning or single-cell analysis of isolates would be needed to estimate the degree of relatedness among the different parasites [4, 18, 24].
Random sampling of the genome-wide SNP data shows that the within-isolate F ws fixation indices may be estimated from modest numbers of SNPs, and correlated with the indices derived from genome-wide data. Therefore, to estimate genotypic mixedness of isolates without whole genome sequencing, it may be feasible to quantitate alleles of between ten and 20 SNPs with other genotyping tools, particularly focusing on SNPs with high overall minor allele frequencies. It is preferable that these SNPs should be neutral, so that estimates are not biased by selection acting on the parasite.
Understanding processes affecting different parasite genotypes within an infection could offer insight into mechanisms that are clinically relevant. Genome sequencing allows broad or deep sampling of diversity within infections [20, 25], but resolution of individual parasite clone genotypes is currently achievable only through either extensive limiting dilution cloning  or single cell genome analysis . Previous studies have shown that proportions of clones in peripheral blood of infected humans vary over time, and can show marked differences between successive days . It is possible that some clones within an infection exist at low proportions due to competitive suppression by other P. falciparum genotypes, or specific selection by the host due to immunity or receptor polymorphisms. Further dissection of the patterns of parasite genotypic diversity in clinical isolates, and possible interactions between genotypes, may lead to novel understanding of malaria parasites which will be relevant for disease control and potential future elimination [6, 27].
This study shows that estimates of genotypic complexity of malaria parasite infections using very different methods give correlated and complementary information. The within-infection fixation index F ws yields a standardized inverse measure (within-infection diversity being 1 − F ws) which may be derived from genome-wide short read sequence data if these are available, or alternatively can be estimated from a modest number of randomly sampled SNPs which could be genotyped by other methods. Multilocus microsatellite PCR-based genotyping gives estimates of infection complexity that correlate strongly with those from the SNP analyses, while also being more sensitive in detecting additional genotypes in some infections that appear to have unmixed sequences. With a wide range of methods now available, studies can choose genotyping and analytical approaches to suit investigational goals, recognizing relative advantages of each in relation to the costs and available resources.
VAM and DJC conceived and designed the study. VAM, EL and KML collected the samples and performed laboratory assays. DPK supported data analysis training for VAM and genome sequence data management. LM, VAM, CWD, SAA, and DJC performed data analysis and interpretation. LM, VAM and DJC wrote the manuscript. All authors read and approved the final manuscript.
This study was partly supported by a Biotechnology and Biological Sciences Research Council (BBSRC) Ph.D. studentship to LM, and a Medical Research Council (MRC) Ph.D. studentship to VAM. Field sample collection, laboratory microsatellite genotyping and data analyses were supported by the MRC Gambia Unit, with additional funding from MRC Grant G1100123 and ERC Grant AdG-2011-294428 to DJC. Genome sequencing was conducted as previously published in collaboration with the MalariaGEN consortium (http://www.malariagen.net). Davis Nwakanma and Alfred Amambua-Ngwa provided useful remote advice during sample collection and transport. Sarah Tarr gave very helpful comments and suggestions on the manuscript.
The authors declare that they have no competing interests.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Anderson TJC, Haubold B, Williams JT, Estrada-Franco JG, Richardson L, Mollinedo R, et al. Microsatellites reveal a spectrum of population structures in the malaria parasite Plasmodium falciparum. Mol Biol Evol. 2000;17:1467–82.
- Conway DJ, Roper C, Oduola AMJ, Arnot DE, Kremsner PG, Grobusch MP, et al. High recombination rate in natural populations of Plasmodium falciparum. Proc Natl Acad Sci USA. 1999;96:4506–11.
- Dye C, Williams BG. Multigenic drug resistance among inbred malaria parasites. Proc Biol Sci. 1997;264:61–7.
- Nkhoma SC, Nair S, Cheeseman IH, Rohr-Allegrini C, Singlam S, Nosten F, et al. Close kinship within multiple-genotype malaria parasite infections. Proc Biol Sci. 2012;279:2589–98.
- Paul REL, Packer MJ, Walmsley M, Lagog M, Ranford-Cartwright LC, Paru R, et al. Mating patterns in malaria parasite populations of Papua New Guinea. Science. 1995;269:1709–11.
- Escalante AA, Ferreira MU, Vinetz JM, Volkman SK, Cui L, Gamboa D, et al. Malaria molecular epidemiology: lessons from the International Centers of Excellence for Malaria Research Network. Am J Trop Med Hyg. 2015;93:79–86.
- de Roode JC, Pansini R, Cheesman SJ, Helinski ME, Huijben S, Wargo AR, et al. Virulence and competitive ability in genetically diverse malaria infections. Proc Natl Acad Sci USA. 2005;102:7624–8.
- Pollitt LC, Mideo N, Drew DR, Schneider P, Colegrave N, Reece SE. Competition and the evolution of reproductive restraint in malaria parasites. Am Nat. 2011;177:358–67.
- Reece SE, Drew DR, Gardner A. Sex ratio adjustment and kin discrimination in malaria parasites. Nature. 2008;453:609–14.
- Santhanam J, Raberg L, Read AF, Savill NJ. Immune-mediated competition in rodent malaria is most likely caused by induced changes in innate immune clearance of merozoites. PLoS Comput Biol. 2014;10:e1003416.
- Klein EY, Smith DL, Laxminarayan R, Levin S. Superinfection and the evolution of resistance to antimalarial drugs. Proc Biol Sci. 2012;279:3834–42.
- Gething PW, Patil AP, Smith DL, Guerra CA, Elyazar IR, Johnston GL, et al. A new world malaria map: Plasmodium falciparum endemicity in 2010. Malar J. 2011;10:378.
- Farnert A, Arez AP, Babiker HA, Beck HP, Benito A, Bjorkman A, et al. Genotyping of Plasmodium falciparum infections by PCR: a comparative multicentre study. Trans R Soc Trop Med Hyg. 2001;95:225–32.
- Anthony TG, Conway DJ, Cox-Singh J, Matusop A, Ratnam S, Shamsul S, et al. Fragmented population structure of Plasmodium falciparum in a region of declining endemicity. J Infect Dis. 2005;191:1558–64.
- Machado RL, Povoa MM, Calvosa VSP, Ferreira MU, Rossit ARB, dos Santos EJM, et al. Genetic structure of Plasmodium falciparum populations in the Brazilian Amazon region. J Infect Dis. 2004;190:1547–55.
- Mobegi VA, Loua KM, Ahouidi AD, Satoguina J, Nwakanma DC, Amambua-Ngwa A, et al. Population genetic structure of Plasmodium falciparum across a region of diverse endemicity in West Africa. Malar J. 2012;11:223.
- Schultz L, Wapling J, Mueller I, Ntsuke PO, Senn N, Nale J, et al. Multilocus haplotypes reveal variable levels of diversity and population structure of Plasmodium falciparum in Papua New Guinea, a region of intense perennial transmission. Malar J. 2010;9:336.
- Nair S, Nkhoma SC, Serre D, Zimmerman PA, Gorena K, Daniel BJ, et al. Single-cell genomics for dissection of complex malaria infections. Genome Res. 2014;24:1028–38.
- Auburn S, Campino S, Miotto O, Djimde AA, Zongo I, Manske M, et al. Characterization of within-host Plasmodium falciparum diversity using next-generation sequence data. PLoS One. 2012;7:e32891.
- Manske M, Miotto O, Campino S, Auburn S, Almagro-Garcia J, Maslen G, et al. Analysis of Plasmodium falciparum diversity in natural infections by deep sequencing. Nature. 2012;487:375–9.
- Venkatesan M, Amaratunga C, Campino S, Auburn S, Koch O, Lim P, et al. Using CF11 cellulose columns to inexpensively and effectively remove human DNA from Plasmodium falciparum-infected whole blood samples. Malar J. 2012;11:41.
- Mobegi VA, Duffy CW, Amambua-Ngwa A, Loua KM, Laman E, Nwakanma DC, et al. Genome-wide analysis of selection on the malaria parasite Plasmodium falciparum in West African populations of differing infection endemicity. Mol Biol Evol. 2014;31:1490–9.
- Anderson TJC, Su X-Z, Bockarie M, Lagog M, Day KP. Twelve microsatellite markers for characterisation of Plasmodium falciparum from finger prick blood samples. Parasitology. 1999;119:113–25.
- Anderson TJ, Nair S, Nkhoma S, Williams JT, Imwong M, Yi P, et al. High heritability of malaria parasite clearance rate indicates a genetic basis for artemisinin resistance in western Cambodia. J Infect Dis. 2010;201:1326–30.
- Bailey JA, Mvalo T, Aragam N, Weiser M, Congdon S, Kamwendo D, et al. Use of massively parallel pyrosequencing to evaluate the diversity of and selection on Plasmodium falciparum CSP T-cell epitopes in Lilongwe, Malawi. J Infect Dis. 2012;206:580–7.
- Farnert A, Lebbad M, Faraja L, Rooth I. Extensive dynamics of Plasmodium falciparum densities, stages and genotyping profiles. Malar J. 2008;7:241.
- Nkhoma SC, Nair S, Al-Saai S, Ashley E, McGready R, Phyo AP, et al. Population genetic correlates of declining transmission in a human pathogen. Mol Ecol. 2013;22:273–85.