Genetic diversity in the block 2 region of the merozoite surface protein-1 of Plasmodium falciparum in central India

Background Malaria continues to be a significant health problem in India. Several of the intended Plasmodium falciparum vaccine candidate antigens are highly polymorphic. The genetic diversity of P. falciparum merozoite surface protein-1 (MSP-1) has been extensively studied in various parts of the world. However, limited data are available from India. The aim of the present study was the molecular characterization of the block 2 region of the MSP-1 gene in isolates from the tribal-dominated, forested region of Madhya Pradesh. Methods DNA sequencing analysis was carried out on 71 field isolates collected from July to November 2005 and on 98 field isolates collected from July to December 2009. Alleles identified by DNA sequencing were aligned with the strain 3D7, and polymorphism analysis was done using the Edit Sequence tool (DNASTAR). Results Malaria positivity was 26% in 2005 and rose to 29% in 2009, and P. falciparum prevalence also increased from 72% in 2005 to 81% in 2009. In 2005, K1 was the most prevalent allele family (51%), followed by MAD20 (28%) and RO33 (21%), while in 2009 RO33 was the most prevalent (40%), followed by K1 (36%) and MAD20 (24%). Conclusions The present study reports extensive genetic variation and dynamic evolution of the block 2 region of MSP-1 in central India. Characterization of antigenic diversity in vaccine candidate antigens is valuable for future vaccine trials as well as for understanding the population dynamics of P. falciparum parasites in this area.


Background
Madhya Pradesh (MP) is situated in the central part of India, and is a highly malarious state contributing 9% of all malaria cases in the country [1]. Plasmodium falciparum infection has dramatically increased in MP in recent years and is associated with life-threatening complications in both children and adults [2,3].
The merozoite surface protein-1 (MSP-1) is a leading vaccine candidate antigen. It is the most abundant surface protein on the blood stage of P. falciparum, and it is thought to play a role in erythrocyte invasion [4]. The primary structure of MSP-1 is polymorphic, and 40% of the amino acid residues differ between allelic forms in P. falciparum [5]. The precursor of MSP-1 is a protein comprising 1,720 amino acids, including a 20-amino-acid signal sequence (SS) and a signal for anchoring the protein at the cellular surface via a GPI moiety (GA). MSP-1 is divided into 17 blocks, which are variable, conserved or semi-conserved [6,7]. The sequences of blocks 1, 3, 5, 12 and 17 are conserved, blocks 2, 4, 6, 8, 10, 14 and 16 diverge extensively, and the remaining blocks 7, 9, 11, 13 and 15 are semi-conserved. Variations in the sequences are dimorphic in nature, with the exception of the polymorphic tripeptide-encoding region in block 2.
The block 2 region includes three allele families: K1, MAD20, and RO33. Alleles in K1 and MAD20 contain antigenically unique tripeptide repeats, with extensive diversity in the number of repeats [7]. RO33 lacks the tripeptide repeats observed in the other two families; however, outside block 2, this allele is similar to the MAD20 type [8]. Fragment size in the three block 2 allele families has commonly been used as a molecular marker in studies of malaria transmission dynamics and host immunity in P. falciparum malaria [9][10][11][12][13]. Protective immune responses have also been observed against motifs present in the major allele families of block 2. Although the evidence suggests that the allele families are maintained by selection, it is not clear how selection acts on the number of tandem repeats [14][15][16].
The purpose of this study was to explore the extent of genetic variation in MSP-1 block 2 over the years in central India, to assess its use as a molecular marker in epidemiologic investigations and studies of malaria transmission dynamics, and ultimately to help in the design of vaccines against antigens under selection pressure.

Study sites
The present study was carried out in the Baigachak area of Dindori district, Madhya Pradesh, India (Figure 1), from July to November 2005 and from July to December 2009, during the peak transmission season. Dindori is a highly malarious district in the State of Madhya Pradesh with a very high transmission rate [17]. Patients between one and 59 years of age presenting with fever and symptoms of P. falciparum malaria were screened for malaria parasites after obtaining consent. Fever history was obtained from the patient or from an accompanying person (in the case of children). A physical examination of each patient was performed and the axillary temperature was recorded.

Sample collection
Parasitologic surveys were carried out to collect blood smears from all fever cases and cases with a history of fever. Blood smears were stained with Jaswant Singh and Bhattacharji (JSB) stain and examined under a light microscope for Plasmodium species identification [18].
Three to five drops of finger-prick blood were blotted on 3 MM filter paper (Whatman) to study the genetic diversity of MSP-1. Consent was obtained from the study subjects before collecting the blood samples.

DNA isolation from filter paper
The blood-spotted area was punched out and put into a 1.5 ml tube. The blood spots were soaked in 150 μl TE buffer (10 mM Tris, 0.1 mM EDTA, pH 8.0) and incubated for one hour at room temperature (RT). After this incubation, the tubes were placed in a dry bath at 50°C for 15 minutes and the spots were punched several times with pipette tips. Finally, the tubes were incubated at 97°C for 15 minutes and centrifuged at 8,000 rpm for 2 minutes. The supernatant was aspirated and stored at -20°C for PCR amplification.

PCR amplification of the msp1 gene
The primary PCR was set up for the amplification of the block 2 region using the primers MSP1A (forward): 5'-CACAATGTGTAACACATGAAAG-3' and MSP1B (reverse): 5'-AGTACGTCTAATTCATTTGCAC-3'. The 646 bp primary PCR product was diluted 1:10 and used as the template for the nested PCR. In the nested PCR, a 555 bp product was amplified using the primers MSP1C (forward): 5'-TAGAAGCTTTAGAAGATGCAG-3' and MSP1D (reverse): 5'-GACAATAATCATTAGCACATAC-3', and then sequenced. The primary PCR was performed in a volume of 20 μL with 0.175 U of Taq DNA polymerase, 0.2 mM each dNTP, 0.4 μM each primer, and 1 mM MgCl2. The reaction was allowed to proceed for 35 cycles after an initial denaturation at 94°C for 1 minute, with annealing at 55°C for 1 minute and extension at 72°C for 1 minute. Final extension was at 72°C for 10 minutes. The nested PCR was performed with annealing at 53°C for 25 cycles. Other nested PCR conditions were the same as those described for the primary PCR [17]. The PCR products were resolved on a 2% agarose gel.
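The final reagent concentrations above translate into pipetting volumes through the usual dilution relation C1 × V1 = C2 × V2. The following is a minimal sketch of that arithmetic only: the final concentrations and the 20 μL reaction volume are taken from the text, whereas the stock concentrations (10 mM dNTP, 10 μM primer, 25 mM MgCl2) and the function name `stock_volume_ul` are assumptions made for illustration and are not reported in the study.

```python
def stock_volume_ul(final_conc, stock_conc, reaction_volume_ul=20.0):
    """Return the stock volume (in uL) needed to reach final_conc in the reaction.

    Applies C1 * V1 = C2 * V2; final_conc and stock_conc must use the same unit.
    """
    if stock_conc <= 0:
        raise ValueError("stock concentration must be positive")
    return final_conc * reaction_volume_ul / stock_conc


# Final concentrations come from the text; stock concentrations are ASSUMED.
recipe = {
    "dNTP (each)": stock_volume_ul(0.2, 10.0),    # 0.2 mM final, assumed 10 mM stock
    "primer (each)": stock_volume_ul(0.4, 10.0),  # 0.4 uM final, assumed 10 uM stock
    "MgCl2": stock_volume_ul(1.0, 25.0),          # 1 mM final, assumed 25 mM stock
}

for reagent, volume in recipe.items():
    print(f"{reagent}: {volume:.2f} uL per 20 uL reaction")
```

With different stock solutions, only the second argument of each call changes; the remaining reaction volume would be made up with template, buffer, enzyme and water.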

Nucleotide sequencing
The PCR products were purified from the agarose gel using the HiYield™ gel/PCR DNA extraction kit (Real Biotech Corp., Taipei County, Taiwan), as per the manufacturer's recommended protocol. From 200 to 250 ng of the gel-purified product was used with the ABI Big Dye Terminator Ready Reaction Kit Version 3.1 (PE Applied Biosystems, Foster City, CA 94404, USA) for the sequencing PCR. The sequencing PCR was performed in a volume of 20 μL with 1 μL of Terminator Ready Reaction Mix (TRR), 3.2 pmol of the gene-specific primer MSP1C (555 bp of block 2 region) and 0.5X sequencing buffer. Cycling conditions for the sequencing PCR included 25 cycles of denaturation at 96°C for 10 seconds, annealing at 50°C for 5 seconds, and extension at 60°C for 4 minutes. Templates were purified and sequenced on an ABI Prism 310 Genetic Analyzer (PE Applied Biosystems).

Sequence analysis
The sequences obtained were translated using the Edit Sequence tool (DNASTAR). The translated sequences were then aligned using the MEGALIGN program (DNASTAR, INC., Madison, WI). The nucleotide sequences were submitted to the GenBank database.
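For readers without access to DNASTAR, the translation step can be illustrated with a short script. This is a minimal sketch using the standard genetic code, not the tool used in the study; the function name `translate` and the example sequence are hypothetical and chosen only to show how nucleotide triplets map onto tripeptide-like motifs.

```python
from itertools import product

# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate a DNA string codon by codon, starting at offset `frame`.

    Stop codons are written as '*'; codons with ambiguous bases become 'X'.
    Trailing bases that do not form a full codon are ignored.
    """
    dna = dna.upper()
    protein = []
    for i in range(frame, len(dna) - 2, 3):
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


# Hypothetical 18-nucleotide fragment, used for illustration only.
example = "TCAGCTCAATCAGGTACA"
print(translate(example))
```

The translated strings can then be aligned against a reference such as 3D7 with any multiple-alignment program.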
The expected heterozygosity was calculated using the formula $H_E = \frac{n}{n-1}\left(1 - \sum_i p_i^2\right)$, where $n$ is the number of samples and $p_i$ is the frequency of allele $i$. $H_E$ is the probability that two alleles randomly drawn from the population sample are different. The mean multiplicity of infection (MOI) was calculated as the total number of clones divided by the number of samples positive for the marker gene. Allele frequencies were further compared between the populations of the two years, and a P value was calculated to assess significance.
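The two summary statistics defined above are simple to compute once alleles have been counted. The sketch below is illustrative only: the function names are hypothetical, $n$ is taken to be the total number of alleles counted (an assumption about how the sample size enters the formula), and the counts in the usage lines are made-up numbers, not data from this study.

```python
def expected_heterozygosity(allele_counts):
    """Compute H_E = n / (n - 1) * (1 - sum(p_i ** 2)).

    allele_counts: number of times each distinct allele was observed.
    n is taken as the sum of the counts (an assumption of this sketch).
    """
    n = sum(allele_counts)
    if n < 2:
        raise ValueError("at least two observations are required")
    sum_squared_freqs = sum((count / n) ** 2 for count in allele_counts)
    return n / (n - 1) * (1 - sum_squared_freqs)


def mean_moi(total_clones, positive_samples):
    """Mean multiplicity of infection: clones per marker-positive sample."""
    if positive_samples <= 0:
        raise ValueError("number of positive samples must be positive")
    return total_clones / positive_samples


# Made-up counts for three alleles, for illustration only.
print(round(expected_heterozygosity([5, 3, 2]), 3))
print(mean_moi(total_clones=15, positive_samples=10))
```

A population in which every isolate carries the same allele gives $H_E = 0$, and the value approaches 1 as alleles become more numerous and more evenly distributed.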

Ethical approval
The study was approved by the Scientific Advisory Committee, Ethical Committee of Regional Medical Research Centre for Tribals, Jabalpur, MP, India and informed consent and human subjects guidelines were followed.

Results
The overall malaria positivity was 26% in 2005, which rose to 29% in 2009 (Table 1). Comparison of the sequences showed that all isolates belonged to one of the three allele families (K1, MAD20 or RO33). In 2005, K1 was the most prevalent allele family (51%), followed by MAD20 (28%) and RO33 (21%), while in 2009 RO33 was the most prevalent (40%), followed by K1 (36%) and MAD20 (24%).
In block 2 of MSP1, the nucleotide and deduced amino acid sequences were found to be highly polymorphic among the isolates. All the nucleotide changes in these isolates were non-synonymous; as a result, the deduced amino acid variations corresponded to one or other allele. A total of 22 variants were found among the K1-type alleles in 2005 and 21 variants in 2009 (Additional file 1: Figure S1). Of these 21 variants, only seven had also been seen in 2005 and the remaining 14 were new variants (Figure 2). The MAD20 allele type had only 11 variants in 2005, while in 2009 a total of 17 variants were found (Additional file 1: Figure S2

Discussion
The genetic diversity of P. falciparum msp1 genes was investigated in field isolates from a high-transmission area of central India over a five-year period. Of the 334 isolates, MSP1 sequencing was successfully completed in 169 isolates.
Most malaria vaccine-candidate antigens are highly polymorphic surface proteins that elicit variant specific immunity. Therefore, the evolutionary relationships could be explored for the design of vaccines based on ancestral sequences, with the potential for including cross-protection against a wide range of antigenic variants. Thus, the understanding of mechanisms and patterns of genetic recombination and sequence variation may help in designing a vaccine that represents the worldwide repertoire of polymorphic malaria surface antigens.
MSP-1, with its numerous alleles that differ in gene length, has been extensively studied, and its genetic polymorphisms have been used to describe the clonality of infections in a large number of studies. Length variability in the MSP families mainly results from repeat sequences. The alleles of MSP-1 belong to the allelic groups K1, MAD20 and RO33, with high variability between the groups but less variability within them. Minor amino acid diversity is created in malaria parasite antigens by single-nucleotide replacement. Depending on the degree of amino acid substitution (highly variable, semi-conserved, conserved), MSP-1 has been categorized into 17 blocks [6]. Genetic recombination accounts for most of the variation seen in malarial antigens, since it occurs several orders of magnitude more frequently than mutation [19]. Differential prevalence of dimorphic MSP1 epitopes had previously been reported by Conway et al. from West Africa (Gambia and Nigeria) and Brazil (eastern Amazon) [20]. Block 2 is of particular interest, as it exhibits repetitive tri-nucleotides and appears to be subject to a rapid intragenic recombination process, comparable to that of the csp gene. It has been shown that IgG antibodies are important in acquired anti-malarial immunity against the most frequent subtypes of block 2 of MSP-1 [15].
Previous studies of MSP1 block 2 allelic types from India have shown varying patterns of diversity [21][22][23]. In the present study, K1 alleles were dominant in the 2005 samples (51%), followed by MAD20 (28%), and the rest were RO33-type alleles. This is in contrast to a 1990 study in Colombia, where only MAD20 and RO33-type alleles were reported and K1-type alleles were missing. In Iran, only a limited number of K1-type alleles (7.6%) were reported by Mehrizi et al. [24]. Mahajan et al. reported all three types of alleles from India [21]. Among Zambian isolates, 54% K1-type, 35% RO33-type and 11% MAD20-like block 2 alleles were reported [5]. Analysis of the 2009 samples in the present study showed 36% K1-type, 40% RO33-type and 24% MAD20-like alleles.
In another study carried out in Choea (north-west Colombia) in 1997, all three MSP1 block 2 allelic types were detected, although MAD20 was the predominant allele and K1 was less frequent [25]. The maximum variation in this study (22 variants [23]. In the present study, RO33-type alleles were semi-conserved: only two variants were found in the 2005 samples, increasing to nine by 2009. The level of antigenic diversity of P. falciparum populations in an area is likely to affect the acquisition of immunity to malaria. Substantial variations in the prevalence of block 2 alleles between the study periods indicate the dynamic nature of the msp1 genetic structure in P. falciparum populations. It is possible that acquisition of strain-specific immunity modulates the selection of different allelic variants, and this may be one explanation for the observed findings. Sequence analysis in the present study identified numerous novel alleles and specific motif arrangements of msp1 block 2 allele sequences, as reported by Escalante et al. [26]. Furthermore, genetic polymorphisms appear to evolve faster in higher-transmission areas than in lower-transmission areas [27,28]. The degree of polymorphism found in the present study is also consistent with the high level of malaria transmission in the study area, as reported previously by Singh et al. [29].
The high MOI observed in this study fits with previous observations of an increased complexity of infection with increasing endemicity [30]. Overall, the findings from this study indicate dynamic evolution of variation in the msp1 gene of P. falciparum in the study area, and the gene could serve as a good marker for studying the P. falciparum population in this region. In addition, the extensive genetic variation in the block 2 region of MSP-1 makes it a useful genetic marker for differentiating parasite strains in clinical trials in this region.

Conclusion
The present study reports extensive genetic variation and dynamic evolution of the block 2 region of MSP-1 in central India. Characterization of antigenic diversity in vaccine candidate antigens is valuable for future vaccine trials as well as for understanding the population dynamics of P. falciparum parasites in this area.

Additional material
Additional file 1: Figure S1. Amino acid sequence alignment of the K1 allelic types of Plasmodium falciparum msp1 gene from central India. Figure S2. Amino acid sequence alignment of the MAD20 allelic types of Plasmodium falciparum msp1 gene from central India. Figure S3. Amino acid sequence alignment of the RO33 allelic types of Plasmodium falciparum msp1 gene from central India.