Genetic diversity and population structure of Plasmodium falciparum in Nigeria: insights from microsatellite loci analysis

Background Malaria remains a public health burden especially in Nigeria. To develop new malaria control and elimination strategies or refine existing ones, understanding parasite population diversity and transmission patterns is crucial. Methods In this study, characterization of the parasite diversity and structure of Plasmodium falciparum isolates from 633 dried blood spot samples in Nigeria was carried out using 12 microsatellite loci of P. falciparum. These microsatellite loci were amplified via semi-nested polymerase chain reaction (PCR) and fragments were analysed using population genetic tools. Results Estimates of parasite genetic diversity, such as mean number of different alleles (13.52), effective alleles (7.13), allelic richness (11.15) and expected heterozygosity (0.804), were high. Overall linkage disequilibrium was weak (0.006, P < 0.001). Parasite population structure was low (Fst: 0.008–0.105, AMOVA: 0.039). Conclusion The high level of parasite genetic diversity and low population structuring in this study suggests that parasite populations circulating in Nigeria are homogenous. However, higher resolution methods, such as the 24 SNP barcode and whole genome sequencing, may capture more specific parasite genetic signatures circulating in the country. The results obtained can be used as a baseline for parasite genetic diversity and structure, aiding in the formulation of appropriate therapeutic and control strategies in Nigeria. Supplementary Information The online version contains supplementary material available at 10.1186/s12936-021-03734-x.

of genetic diversity, transmission intensity, and parasite population structure in Nigeria, the most malaria-burdened country, is essential if the goal of malaria control or elimination is to be achieved.
Molecular techniques play important roles in the analyses of genetic diversity, transmission dynamics, and population structure of P. falciparum field isolates. Early molecular studies focused mostly on the use of polymorphic markers, such as merozoite surface protein 1 (msp-1), merozoite surface protein 2 (msp-2) and glutamate-rich protein (glurp), to characterize P. falciparum genetic diversity and structure in Nigeria [7][8][9]. These markers were also useful in monitoring drug efficacy, specifically in classifying recurrent P. falciparum parasitaemia as re-infection or recrudescent infection [6,10,11]. However, earlier studies in Nigeria gave contrasting reports of polymorphisms in MSP-1 and MSP-2 [6,12,13,14], which is associated with the fact that these antigenic markers are often under intense immune pressure [15][16][17]. The genotyping results provided by these markers can, therefore, give a masked and distorted view of the population structure and transmission patterns, which may account for observed variations across parasite populations circulating in a given environment [6].
Microsatellite loci have been suggested to be better alternatives to msp-1, msp-2 and glurp due to their abundance, putative neutrality and higher levels of polymorphism [18]. Microsatellite genotyping remains one of the most efficient and reliable methods for analyzing the genetic diversity of P. falciparum populations for epidemiological and drug efficacy purposes within countries and across continents [19]. Past studies using microsatellite analyses observed that parasites from areas of low malaria transmission [19] (< 1% infection) have less genetic diversity but more population structure and greater linkage disequilibrium (i.e., more non-random association among alleles across multiple loci) [4,19,20,21]. Conversely, in regions of high malaria transmission, individuals are more likely to be infected by more than one P. falciparum parasite, which increases the rate of recombination and subsequently yields a highly diverse population with low linkage disequilibrium [18,19,22]. However, some studies report a deviation from this norm, whereby high levels of heterozygosity (a measure of genetic diversity) are observed in several low transmission countries [18,23,24]. This suggests that a high level of heterozygosity may reflect past human demographic processes as opposed to recent epidemiological factors [25].
The objective of this study was to investigate the genetic diversity of circulating P. falciparum parasites and their population structures in Nigerian children 6-96 months old with uncomplicated infections, treated with artemisinin-based combination therapy (ACT).

Study population
Children aged 6-96 months old were eligible for enrollment in the efficacy study if they had symptoms compatible with uncomplicated malaria such as fever, anorexia, vomiting or abdominal discomfort with or without diarrhea with P. falciparum infections.

Sample collection
Filter papers containing dried blood spots (DBS) obtained from 633 children confirmed as malaria positive by microscopy were randomly selected for this study. All DBS samples were collected in 2014 (Adamawa, Bayelsa, Imo, Kwara, Oyo and Sokoto) and 2018 (Enugu, Kano and Plateau). Samples were collected over three months (July-September), which represents the intense malaria transmission season in Nigeria. Two to three drops of finger-pricked blood were blotted on 3 mm Whatman filter paper (Whatman International Limited, Maidstone, UK) before treatment initiation (Day 0). The blood samples impregnated onto filter papers were allowed to air-dry properly at room temperature, and DBS were kept in airtight envelopes with silica gel at room temperature until analysed.

DNA extraction
DNA was extracted from DBS for parasite genetic diversity and population structure studies as previously described [26]. DNeasy Blood and Tissue extraction kit (Qiagen, Germany) was used to extract parasite DNA from DBS following the manufacturer's protocol.

Plasmodium falciparum genotyping by microsatellite loci analysis
Semi-nested PCR amplification of 12 P. falciparum microsatellite loci was done using a previously described protocol [17]. The 12 microsatellite loci were Poly A, PfG377, TA81, ARA2, TA87, TA40, TA42, 2490, TA1, TA60, TA109 and PfPk2 [27]. FAM, YAK YELLOW, and ATTO550N-labeled PCR products for the different loci amplified were pooled together for electrophoresis on the ABI 3500XL Genetic Analyzer at the African Centre of Excellence for the Genomics of Infectious Diseases (ACEGID), Redeemer's University Ede, Osun State, Nigeria. Peakscanner (Applied Biosystems) and GeneMarker (Softgenetics) software were used for normalization across runs and automatic determination of allele length and peak heights in samples containing multiple alleles per locus.

Data analysis
Microsatellite data were retrieved from the Genetic Analyzer 3500XL and formatted in Microsoft Excel (version 16.44) as previously described [28]. Subsequent genetic analysis was only done on samples where all microsatellite markers were successfully amplified. Multiple alleles at a given locus were assumed if minor peaks observed were more than 20% of the height of the predominant peak [6]. Although minor alleles were scored in samples containing them, only the predominant alleles were considered for all population genetic and structure analyses. Multi-clone infections were defined as those that had at least two loci containing multiple alleles (only samples with two alleles were included for analysis), while single-clone infections were defined as those containing one allele for all microsatellite loci or those in which only one locus contained multiple alleles [6]. Haplotypes were computed using ARLEQUIN software version 3.11 [29] from both single-clone and multi-clone infections. The predominant allele at each locus was used to define the twelve-locus parasite haplotype in multi-clone infections [30].
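The allele-calling and infection-classification rules above can be sketched in a few lines of Python. This is only an illustrative sketch, not the pipeline used in the study: the function names, the dictionary layout (locus name mapped to allele size and peak height) and the example peak heights are all assumptions made for the illustration.

```python
from typing import Dict, List

# Threshold stated in the text: a minor peak counts as an allele only if it is
# more than 20% of the height of the predominant peak.
MINOR_PEAK_FRACTION = 0.20


def call_alleles(peaks: Dict[int, float]) -> List[int]:
    """Return the allele sizes called at one locus, predominant allele first.

    `peaks` maps allele size (bp) to peak height (hypothetical input layout).
    """
    if not peaks:
        return []
    tallest = max(peaks.values())
    kept = [size for size, height in peaks.items()
            if height > MINOR_PEAK_FRACTION * tallest]
    # Sort by descending peak height so index 0 is the predominant allele.
    return sorted(kept, key=lambda size: peaks[size], reverse=True)


def classify_infection(sample: Dict[str, Dict[int, float]]) -> str:
    """Multi-clone if at least two loci carry multiple alleles, else single-clone."""
    loci_with_multiple = sum(
        1 for peaks in sample.values() if len(call_alleles(peaks)) > 1
    )
    return "multi-clone" if loci_with_multiple >= 2 else "single-clone"


def predominant_haplotype(sample: Dict[str, Dict[int, float]]) -> Dict[str, int]:
    """Predominant allele per locus; assumes every locus amplified successfully."""
    return {locus: call_alleles(peaks)[0] for locus, peaks in sample.items()}
```

For example, a sample with one locus carrying a 30% minor peak and another carrying a 12.5% minor peak would be classified as single-clone, because only one locus passes the 20% threshold.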

Measures of parasite genetic diversity

Number of effective alleles (Ne) and number of different alleles (Na)
The number of effective alleles (Ne) and number of different alleles (Na) were computed per locus for each State involved in the study using GENALEX 6.5 [28].
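As an illustration of what these two statistics measure, the sketch below computes Na as the count of distinct alleles at a locus and Ne as the reciprocal of the sum of squared allele frequencies, which is the usual definition of the effective number of alleles. The function name and the example allele sizes are hypothetical; the study itself used GENALEX 6.5.

```python
from collections import Counter
from typing import Hashable, Sequence, Tuple


def na_and_ne(alleles: Sequence[Hashable]) -> Tuple[int, float]:
    """Return (Na, Ne) for one locus in one population.

    `alleles` holds the predominant allele of each isolate at this locus.
    Na = number of distinct alleles; Ne = 1 / sum(p_i^2).
    """
    if not alleles:
        raise ValueError("at least one allele is required")
    n = len(alleles)
    counts = Counter(alleles)
    sum_p_squared = sum((count / n) ** 2 for count in counts.values())
    return len(counts), 1.0 / sum_p_squared
```

Ne equals Na only when all alleles are equally frequent; a skewed frequency distribution pulls Ne below Na, which is consistent with the mean Ne being lower than the mean Na in this study.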

Allelic richness
Allelic richness (Ar) was computed using FSTAT (v 3.1) as the average number of alleles per locus [31].

Expected heterozygosity
The expected heterozygosity (He), a measure of parasite genetic diversity that represents the probability of being infected by two parasites with different alleles at a given locus, was calculated using ARLEQUIN software version 3.11 [29] with the formula:
$$H_e = \frac{n}{n-1}\left(1 - \sum_{i} p_i^2\right) \qquad (1)$$
where n is the number of isolates analysed, and p_i represents the frequency of the i-th allele at a locus [29].
The He values range from 0 to 1. Values closer to 0 indicate little or no genetic diversity while values closer to 1 indicate high genetic diversity.
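A minimal sketch of this calculation is given below, assuming the commonly used sample-size-corrected estimator He = [n/(n - 1)] x (1 - sum of squared allele frequencies), with n the number of isolates. The function name and example values are hypothetical, and the sketch is not a substitute for ARLEQUIN.

```python
from collections import Counter
from typing import Hashable, Sequence


def expected_heterozygosity(alleles: Sequence[Hashable]) -> float:
    """He for one locus: n/(n-1) * (1 - sum(p_i^2)).

    Assumes one (predominant) allele per isolate; n is the number of isolates.
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("at least two isolates are required")
    counts = Counter(alleles)
    sum_p_squared = sum((count / n) ** 2 for count in counts.values())
    return n / (n - 1) * (1.0 - sum_p_squared)
```

A locus at which every isolate carries the same allele gives He = 0, and a locus at which every isolate carries a different allele gives He = 1, matching the interpretation of the two extremes given above.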

Analysis of molecular variance (AMOVA)
Inter- and intra-population variance was determined with analysis of molecular variance (AMOVA, i.e., ΦPT). A ΦPT value of zero (0) is considered indicative of no genetic differentiation among populations.

Fixation index (Fst)
The population divergence was measured by calculating the fixation index (Fst) for all pairs of parasite population in each State using the GENALEX 6.5 software. An Fst value between 0-0.05 was classified as little genetic differentiation, 0.05-0.15: moderate genetic differentiation, 0.15-0.25: great genetic differentiation and values greater than 0.25 represented very great genetic differentiation [32].
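The interpretation bands above can be written as a small lookup. The sketch below is illustrative only: the text does not say which band a boundary value such as exactly 0.05 belongs to, so assigning boundary values to the lower band is an assumption, as are the function name and the treatment of negative estimates.

```python
def classify_fst(fst: float) -> str:
    """Map a pairwise Fst value to the qualitative bands quoted in the text.

    Assumption: boundary values (0.05, 0.15, 0.25) fall in the lower band.
    Negative estimates, which can arise from sampling error, are treated as
    little differentiation (also an assumption).
    """
    if fst <= 0.05:
        return "little"
    if fst <= 0.15:
        return "moderate"
    if fst <= 0.25:
        return "great"
    return "very great"
```

Under this convention the two extreme pairwise values reported in this study, 0.008 and 0.105, map to "little" and "moderate" respectively.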

Cluster analysis
A Bayesian model implemented in the program STRUCTURE v2.3 [33] was used to determine the number of populations or genetic clusters present in Nigerian States considered in this study. A linked model with admixture was used with 5 replicates for each value of k (from 2 to 6), and a burn-in period of 50,000 iterations of Monte Carlo Markov chains [34]. To obtain the optimal number of genetic populations, estimation of ΔK described by Evanno analysis was done using Structure Harvester [35].
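The ΔK statistic can be illustrated with a short sketch. It assumes the usual Evanno formulation, in which ΔK is the absolute second-order difference of the mean log-likelihood L(K) across replicate runs, divided by the standard deviation of L(K) across those replicates. The function name, the input layout and the use of the sample standard deviation are assumptions; this is not the Structure Harvester code.

```python
from statistics import mean, stdev
from typing import Dict, List


def evanno_delta_k(ln_prob: Dict[int, List[float]]) -> Dict[int, float]:
    """Return ΔK for every K that has results at K-1, K and K+1.

    `ln_prob` maps K to the list of log-likelihood values from replicate runs.
    ΔK = |mean L(K+1) - 2 * mean L(K) + mean L(K-1)| / sd(L(K)).
    """
    means = {k: mean(values) for k, values in ln_prob.items()}
    delta_k: Dict[int, float] = {}
    for k in sorted(ln_prob):
        if k - 1 not in means or k + 1 not in means:
            continue  # ΔK is undefined at the smallest and largest K tested
        spread = stdev(ln_prob[k])  # sample standard deviation (assumption)
        if spread == 0:
            continue  # ΔK is undefined when replicates are identical
        second_difference = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        delta_k[k] = second_difference / spread
    return delta_k
```

Because ΔK needs results at both neighbouring values, runs over k from 2 to 6 yield ΔK only for k from 3 to 5; the k with the largest ΔK is usually taken as the most likely number of clusters.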

Linkage disequilibrium (LD)
Multilocus linkage disequilibrium, measured as the standardized index of association (I^S_A), was calculated using the program LIAN version 3.5 [36] for the whole dataset and for a data subset with haplotypes from only confirmed single-clone infections, as a precaution against the bias that may result from the presence of any false dominant haplotypes [37]. This index was calculated as:
$$I_A^S = \frac{1}{n-1}\left(\frac{V_D}{V_E} - 1\right) \qquad (2)$$
where n is the number of loci, VE is the expected variance of the number of loci at which two individuals differ, and VD is the observed variance. A randomization test was done to determine whether the ratio VD/VE was significantly higher than 1.
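To make the index concrete, the sketch below computes a standardized index of association from a list of multilocus haplotypes. It assumes the standard form (VD/VE - 1)/(L - 1) for L loci, with VD taken as the variance of the pairwise number of mismatching loci and VE as the sum over loci of h(1 - h), where h is the per-locus gene diversity. These definitions and the function name are assumptions for illustration, and the sketch omits the randomization test performed by LIAN.

```python
from collections import Counter
from itertools import combinations
from typing import Hashable, Sequence


def standardized_index_of_association(
    haplotypes: Sequence[Sequence[Hashable]],
) -> float:
    """Standardized index of association for equal-length haplotypes."""
    n_isolates = len(haplotypes)
    n_loci = len(haplotypes[0]) if haplotypes else 0
    if n_isolates < 2 or n_loci < 2:
        raise ValueError("need at least two isolates and two loci")

    # Observed variance (V_D) of pairwise mismatch counts over all pairs.
    distances = [
        sum(a != b for a, b in zip(h1, h2))
        for h1, h2 in combinations(haplotypes, 2)
    ]
    mean_distance = sum(distances) / len(distances)
    v_d = sum((d - mean_distance) ** 2 for d in distances) / len(distances)

    # Expected variance (V_E) under linkage equilibrium: sum of h_j * (1 - h_j).
    v_e = 0.0
    for j in range(n_loci):
        counts = Counter(h[j] for h in haplotypes)
        sum_p_squared = sum((c / n_isolates) ** 2 for c in counts.values())
        h_j = n_isolates / (n_isolates - 1) * (1.0 - sum_p_squared)
        v_e += h_j * (1.0 - h_j)
    if v_e == 0:
        raise ValueError("expected variance is zero; index is undefined")

    return (v_d / v_e - 1.0) / (n_loci - 1)
```

With these definitions a set of haplotypes whose loci are perfectly linked gives a value of 1, while values near 0, such as the overall estimate reported in this study, indicate nearly random association among loci.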

Demographics and baseline characteristics
Overall, 329 (51.97%) were male and the mean age of all children included in the study was 48.4 ± 15.8 months. Also, mean enrollment body temperature was 37.5 ± 2.5 °C. Overall geometric mean asexual parasitaemia was 16,219 μL⁻¹ (range: 200).

Parasite genetic diversity
Of the 633 samples considered for analysis, microsatellite amplification was successful in 571 (90.2%). Most (67.1%) samples were multi-clone infections. There were as many haplotypes as isolates fully genotyped in the dataset (i.e., all haplotypes were unique). The mean (computed as the average of the values from each locus) number of different alleles (Na, 13.52), effective alleles (Ne, 7.13), allelic richness (Ar, 11.15) and expected heterozygosity (He, 0.804) were all high (Table 1). Although genetic diversity was high in all populations, Ar was especially high in parasite populations from two States (Enugu and Kano) obtained in 2018 (Fig. 1).
The distribution of Na, Ne, Ar and He per microsatellite locus across the nine States is presented in Additional files 1, 2, 3. The Kruskal-Wallis test showed no significant difference between the Ar values observed across the nine States (P > 0.05). The same was observed for He and Ne values (P > 0.05).

Parasite population differentiation
Non-random associations among loci (multilocus LD) were measured for all complete haplotypes and also for those from single-clone infections by calculating the standardized index of association (I^S_A). In both the complete data set (multi-clone and single-clone infections) and the sub-data set (single-clone infections), LD values obtained in parasite populations from Enugu, Kano, Sokoto and Plateau States were significant (P < 0.01) (Table 2). Three of the four parasite populations with significant LD were from 2018 (i.e., Enugu, Kano and Plateau) while just one (Sokoto) was from 2014. Pairwise genetic differentiation (Fst) among study sites ranged from low to moderate (Table 3). The lowest genetic differentiation (0.008) was observed between Imo (South-East 1) and Kwara (North Central 1), while the largest (0.105) was observed between Sokoto (North-West 1) and Oyo (South West); this still represents moderate genetic differentiation (Table 3).
The AMOVA result (0.039) further confirmed the low parasite genetic differentiation, as just 3.9% of the genetic variation was observed between study sites. Furthermore, cluster analysis using STRUCTURE confirmed low population differentiation, as only three putative parasite clusters (CL 1, CL 2 and CL 3; ΔK = 14.73) were identified as admixtures in the nine States (Fig. 2). The majority of parasite populations from 2014 (Adamawa, Bayelsa, Imo, Sokoto and Kwara) were in the first cluster (blue), with the exception of Oyo (green). In 2018, two clusters were observed (red and green). In Enugu, most parasites were in the third cluster (green), while in Plateau, most parasites were in the second cluster (red). Parasites from Kano State were distributed almost evenly between clusters 2 and 3.

Discussion
Nigeria remains the country with the highest global malaria burden. Hence, molecular studies on P. falciparum diversity and population structure are essential in monitoring the impact of different intervention strategies on the control of malaria transmission. This study used 12 microsatellite loci to evaluate P. falciparum genetic diversity and population structure in nine Nigerian States. Although microsatellites are better alternatives to polymorphic markers such as msp-1, msp-2, and glurp, there are only a few reports of their use in studies conducted in Nigeria [6]. Analysis of the microsatellite data generated in this study revealed high parasite genetic diversity across all States. For instance, the mean Ne (computed as the average of the Ne values across the 12 microsatellite loci) in the nine States ranged from 6.2 to 8.2. This is expected because Ne per locus is likely to be high in areas with high malaria endemicity and low in areas with low endemicity [5,19]. As such, this study's Ne values were comparable to those reported in other high-endemic regions of Sub-Saharan Africa [4,5,38]. Similarly, values of allelic richness (Ar) obtained in this study further confirmed the high level of parasite genetic diversity in all nine States (range: 7.09-14.27) and were similar to those reported in other malaria endemic countries [39]. Furthermore, expected heterozygosity (He) values observed in all nine States were high, ranging from 0.776 to 0.842. These values are similar to those reported in other malaria endemic countries [5,6,18,40,41]. This further supports the conclusion that parasite genetic diversity and parasite transmission within the country are high [42].
It is equally important to note that when samples were stratified as Northern States (Adamawa, Kano, Kwara, Plateau and Sokoto) and Southern States (Bayelsa, Enugu, Imo, and Oyo) to investigate the possible influence of geographic location on observed parasite diversity, there was no significant difference in measures of genetic diversity, i.e., Ne, He, and Ar values (P > 0.05). This is equally expected, as malaria endemicity continues to be high across the country.
Although parasite genetic diversity was high, further analysis of microsatellite data revealed low parasite population differentiation. When all nine States were considered as a single population, the overall association index was 0.0065 (P < 0.01), which is weaker than those typically reported in regions with low transmission [21,23]. Studies have associated low LD values, such as those reported in this study, with high levels of malaria transmission, which lead to increased crossbreeding and meiotic recombination that result in LD breakdown [5,6,19,43]. The pairwise genetic differentiation (Fst) among study sites showed low to moderate genetic variation (0.008-0.105; P < 0.001). This is similar to what was reported earlier in Nigeria [6]. Furthermore, the observed low population differentiation was confirmed by AMOVA (0.039). This implies that only about 3.9% of the genetic variation exists among the nine States investigated. Cluster analysis also showed that only three parasite clusters exist among the nine States. However, the majority of parasite populations from 2014 (Adamawa, Bayelsa, Imo, Sokoto and Kwara) were in the first cluster (blue), with the exception of Oyo (green). In 2018, parasites from Enugu were in the third cluster (green), those from Plateau were in the second cluster (red) and those from Kano were distributed between clusters 2 and 3. Although samples analysed in this work were representative of the country as a whole, a major limitation was that samples from each State were only collected at a single time point (i.e., either 2014 or 2018); thus, a spatio-temporal analysis could not be done. Such an analysis might have provided more insight into the variations in clustering patterns observed in this study.
In summary, it has been observed that parasites from areas of low malaria transmission [19] (< 1% infection) show less genetic diversity, more population structure and greater linkage disequilibrium [4,19,20,21]. In this study, the contrary was observed, i.e., high genetic diversity, low population structure and weak linkage disequilibrium. This is typical of regions of high malaria transmission, where individuals are more likely to be infected by more than one P. falciparum parasite, which increases the rate of recombination and subsequently yields a highly diverse population with low linkage disequilibrium [18,19,22]. It is plausible that the low to moderate genetic differentiation observed between States is a result of immense human migration between these populations as part of the usual socioeconomic activities, and of indiscriminate vector migration within the country [6,37,44,45].

Conclusion
This study represents the first use of 12 microsatellite loci to characterize parasite genetic diversity and structure in Nigeria across regions representing all six geographical zones of the country. The high level of parasite genetic diversity and low population structuring in this study suggest that parasite transmission is high and circulating parasites may be homogenous. However, higher resolution methods such as the 24 SNP barcode and whole genome sequencing may capture more specific parasite genetic signatures circulating in the country. The results obtained in this study can be used as a baseline for parasite genetic diversity and structure, aiding in the formulation of appropriate therapeutic and control strategies in Nigeria.