Genetic diversity of circumsporozoite protein in Plasmodium knowlesi isolates from Malaysian Borneo and Peninsular Malaysia
Malaria Journal volume 19, Article number: 377 (2020)
Understanding the genetic diversity of candidate genes for malaria vaccines such as circumsporozoite protein (csp) may enhance the development of vaccines for treating Plasmodium knowlesi. Hence, the aim of this study is to investigate the genetic diversity of non-repeat regions of csp in P. knowlesi from Malaysian Borneo and Peninsular Malaysia.
A total of 46 csp genes were subjected to polymerase chain reaction amplification. The genes were obtained from P. knowlesi isolates collected from different divisions of Sabah, Malaysian Borneo, and Peninsular Malaysia. The targeted gene fragments were cloned into a commercial vector and sequenced, and a phylogenetic tree was constructed while incorporating 168 csp sequences retrieved from the GenBank database. The genetic diversity and natural evolution of the csp sequences were analysed using MEGA6 and DnaSP ver. 5.10.01. A genealogical network of the csp haplotypes was generated using the NETWORK software.
The phylogenetic analysis revealed indistinguishable clusters of P. knowlesi isolates across different geographic regions, including Malaysian Borneo and Peninsular Malaysia. Nucleotide analysis showed that the csp non-repeat regions of zoonotic P. knowlesi isolates obtained in this study underwent purifying selection with population expansion, which was supported by extensive haplotype sharing observed between humans and macaques. Novel variations were observed in the C-terminal non-repeat region of csp.
The csp non-repeat regions are relatively conserved and there is no distinct cluster of P. knowlesi isolates from Malaysian Borneo and Peninsular Malaysia. Distinctive variation data obtained in the C-terminal non-repeat region of csp could be beneficial for the design and development of vaccines to treat P. knowlesi.
Malaria in humans is caused by Plasmodium species, including Plasmodium vivax, Plasmodium ovale spp., Plasmodium malariae, Plasmodium falciparum, and Plasmodium knowlesi. Plasmodium parasites also infect the crab-eating macaque (Macaca fascicularis), also known as the long-tailed macaque, and the southern pig-tailed macaque (Macaca nemestrina). Thus, malaria is considered an emerging zoonotic disease. Humans may acquire these parasites through anopheline mosquito vectors when they are in forests close to the habitats of infected macaques. The World Malaria Report 2019 estimates that 228 million new malaria cases and 405,000 deaths from malaria occurred worldwide in 2018. These figures are at odds with the vision set by the World Health Organization in early 2015, which aims for a reduction of at least 90% in global malaria incidence and mortality rates by 2030.
Malaysia is vulnerable to malaria transmission since it is located in a hot and humid equatorial region. An estimated 1.26 million people in Malaysia live in hyperendemic areas and are at high risk of contracting malaria. A large focus of human P. knowlesi infections, validated by molecular methods, was previously found to have been misidentified as P. malariae by microscopy because of the similar morphological characteristics of the two species [3, 4]. Since then, human infections by P. knowlesi have been reported in Malaysia [5,6,7], and increased detection of the infections in the country has recently been documented [8,9,10,11]. Besides Malaysia, P. knowlesi infections in humans have also been reported in other Southeast Asian countries. Malaria should therefore be prevented or controlled in these areas, and vaccination is one of the proposed methods. Thus, a more effective vaccine is urgently needed to control the transmission of P. knowlesi.
Circumsporozoite protein (csp) is one of the targeted candidates for vaccine development against malaria. This protein is multifunctional in malaria transmission, including mediation of sporozoite development and assisting sporozoite migration from mosquitoes’ midguts to mammalian livers [13,14,15]. The csp gene has been shown to be a useful biomarker for delineating the phylogenetic relationship of Plasmodium species [16, 17], and a recent study investigated the genetic diversity of csp between P. knowlesi isolates from Malaysian Borneo and Peninsular Malaysia. However, the Malaysian Borneo isolates in that study were obtained only from the Interior Division of Sabah. To obtain more accurate general interpretations, the present study adds new P. knowlesi isolates from the Sandakan Division of Sabah (another malaria hotspot area) to reveal more information about the genetic diversity of csp non-repeat regions between P. knowlesi isolates from Malaysian Borneo and Peninsular Malaysia.
Human blood samples and Plasmodium DNA extraction
A total of 46 human blood samples infected with P. knowlesi were collected, with informed consent, from symptomatic malaria patients in Sabah, Malaysian Borneo, and Peninsular Malaysia from 2008 to 2011 (Table 1). The sampling sites were Sandakan Division (13 samples from Telupid Health Clinic), Interior Division (6 samples from Keningau Hospital, 3 samples from Nabawan Health Clinic, 5 samples from Tambunan Hospital, and 6 samples from Tenom Hospital), and the University of Malaya Medical Centre, Peninsular Malaysia (13 samples). The presence of P. knowlesi parasites in the samples was confirmed by an experienced pathologist and verified using the PlasmoNex™ diagnostic system. Plasmodium DNA was extracted using a previously described approach. Ethical approval for this study was obtained from the Ethics Committee of University of Malaya Medical Centre (reference no. 709.2).
Polymerase chain reaction (PCR) amplification of csp gene
PCR amplification was carried out using a 20-μL reaction mixture containing 1X GoTaq® buffer (Promega, USA), 2.0 mM of MgCl2 solution, 0.2 mM of dNTPs, 0.2 μM of each primer, and 1.25 units of GoTaq® DNA polymerase (Promega, USA). The primers used for csp gene amplification were Pkcsp-F: 5′-TCC TCC ACA TAC TTA TAT ACA AGA-3′ and Pkcsp-R: 5′-GTA CCG TGG GGG ACG CCG-3′, which were derived from a previous study. The PCR conditions were 94 °C for 4 min, followed by 40 amplification cycles of 94 °C for 30 s, 55 °C for 50 s, and 72 °C for 2 min, and a final extension step at 72 °C for 10 min. PCR amplicons were subjected to electrophoresis and analysed in 1% agarose gel stained with ethidium bromide.
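The cycling conditions above can be written down as a small data structure, which makes the programmed run time easy to check. This is only an illustrative sketch: the variable and function names are hypothetical, and ramp times between temperatures (which depend on the thermocycler) are ignored.

```python
# Thermal profile as stated in the text: (temperature in deg C, hold time in seconds).
INITIAL_DENATURATION = (94, 4 * 60)
CYCLE_STEPS = [(94, 30), (55, 50), (72, 2 * 60)]  # denaturation, annealing, extension
N_CYCLES = 40
FINAL_EXTENSION = (72, 10 * 60)


def total_hold_seconds():
    """Sum of programmed hold times; ramping between temperatures is ignored."""
    per_cycle = sum(seconds for _, seconds in CYCLE_STEPS)
    return INITIAL_DENATURATION[1] + N_CYCLES * per_cycle + FINAL_EXTENSION[1]


minutes, seconds = divmod(total_hold_seconds(), 60)
print(f"Programmed hold time: {minutes} min {seconds} s")
```

With the stated profile, the hold times alone add up to 8840 s, so a run takes roughly two and a half hours before ramping is counted.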
Molecular cloning and sequencing
A QIAquick Gel Extraction Kit (Qiagen, Germany) was used to isolate PCR products from the agarose gel, and the purified PCR products were cloned into the pJET1.2/blunt vector using a CloneJET PCR Cloning Kit (Thermo Scientific, USA) according to the manufacturer’s instructions. The ligated vectors were transformed into Escherichia coli strain JM109 using a conventional heat-shock method. The desired plasmid containing the csp gene was extracted from a single colony using a QIAprep Spin Miniprep Kit (Qiagen, Germany) according to the manufacturer’s recommendations and sequenced using the pJET1.2 forward sequencing primer.
Sequence alignment and phylogenetic analysis of csp gene
In addition to the 46 csp sequences obtained in this study, a total of 168 csp sequences (Additional file 1) were retrieved from the GenBank database, including 33 sequences from macaques in Sarawak, 24 sequences from macaques in Singapore, 62 sequences from humans in Peninsular Malaysia, 33 sequences from humans in Sarawak, 14 sequences from humans in Singapore, and 2 sequences as an outgroup. Alignment was performed using the CLUSTAL-W tool in Molecular Evolutionary Genetics Analysis 6 (MEGA6) software. Next, the central repeat region of the csp gene was trimmed, whereas the N-terminal (first 195 bp of the coding csp gene) and the C-terminal (last 261 bp of the coding csp gene with exception of the stop codon) non-repeat regions were combined (total length = 453 bp) for the analysis. The maximum likelihood method was used to construct a phylogenetic tree, with 1000 bootstrap replicates to test the robustness and reliability of the tree.
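The trimming step can be illustrated with a short sketch. It assumes that the stated 261 bp count includes the 3-bp stop codon, since 195 + 261 − 3 gives the reported combined length of 453 bp; that reading is an inference from the numbers, not something the text states explicitly. The function name and the synthetic sequence are hypothetical.

```python
N_TERM_BP = 195      # first 195 bp of the coding sequence (from the text)
C_TERM_BP = 261      # last 261 bp of the coding sequence (from the text)
STOP_CODON_BP = 3    # assumption: the 261 bp count includes the stop codon


def combine_non_repeat(cds: str) -> str:
    """Drop the central repeat region and join the two non-repeat regions."""
    if len(cds) < N_TERM_BP + C_TERM_BP:
        raise ValueError("coding sequence is shorter than the two non-repeat regions")
    n_term = cds[:N_TERM_BP]
    c_term = cds[-C_TERM_BP:-STOP_CODON_BP]  # exclude the terminal stop codon
    return n_term + c_term


# Synthetic coding sequence for illustration only (not a real csp sequence).
toy_cds = "ATG" + "ACG" * 400 + "TAA"
combined = combine_non_repeat(toy_cds)
print(len(combined))
```

In a real analysis the cut points would be taken from the alignment, because the repeat region varies in length between isolates.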
Csp sequence diversity, natural selection, and haplotype analyses
Variation in the combined csp sequences was determined using DnaSP ver. 5.10.01. The data obtained included the average number of pairwise nucleotide differences (K), number of haplotypes (h), haplotype diversity (Hd), and nucleotide diversity (π). A further analysis of π was performed using a sliding window of 100 bp with a step size of 25 bp to estimate the step-wise diversity of csp. In addition, the rates of synonymous (dS) and non-synonymous (dN) mutations were obtained and compared with a Z-test in MEGA6 using Nei and Gojobori’s approach with Jukes and Cantor correction. For testing the neutral theory of evolution, Tajima’s D test as well as Fu and Li’s D and F tests were performed using DnaSP ver. 5.10.01. The median-joining star contraction approach in the NETWORK software was used to generate the relationship of csp haplotypes for isolates obtained in this study.
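For readers unfamiliar with these summary statistics, the following is a minimal sketch of K, π, and the sliding-window π. It assumes gap-free aligned sequences of equal length and applies no substitution-model correction, so it is not a reimplementation of DnaSP and its window coordinates need not match those reported by that program. All function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def pairwise_differences(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


def mean_pairwise_differences(seqs):
    """K: average number of nucleotide differences over all sequence pairs."""
    pairs = list(combinations(seqs, 2))
    return sum(pairwise_differences(a, b) for a, b in pairs) / len(pairs)


def nucleotide_diversity(seqs):
    """pi: K divided by the number of aligned sites."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def sliding_window_pi(seqs, window=100, step=25):
    """Return (start, end, pi) per window, with 1-based inclusive coordinates."""
    length = len(seqs[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        results.append((start + 1, start + window, nucleotide_diversity(chunk)))
    return results


# Toy alignment for illustration only.
toy = ["AAAA", "AAAT", "AATT"]
print(mean_pairwise_differences(toy))
print(sliding_window_pi(toy, window=2, step=1))
```

The defaults of 100 and 25 mirror the window length and step size given in the text; the toy call uses smaller values only because the example alignment is four sites long.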
A total of 214 csp combined sequences were included in the phylogenetic tree analysis. The tree revealed indistinguishable geographical clusters of the P. knowlesi isolates included in this study (Fig. 1). A distinct cluster was observed for outgroups, and the P. knowlesi isolates were genetically closer to Plasmodium coatneyi based on the non-repeat csp sequence.
The nucleotide alignment of the csp in this study showed that the average number of pairwise nucleotide differences (K) was 8.958. The overall nucleotide diversity (π) and haplotype diversity (Hd) were 0.0199 ± 0.0005 and 0.9822 ± 0.0029, respectively. The detailed analysis of π with a sliding window length of 100 bp and step size of 25 bp revealed that the nucleotide position at 303–402 bp had the highest peak of nucleotide diversity (Fig. 2). The average rates of synonymous mutation (dS) and non-synonymous mutation (dN) were 0.037 and 0.015, respectively, and the dN/dS ratio was 0.405 (dS > dN; p < 0.05 in Z-test). In testing the neutral theory of evolution, Tajima’s D was −1.830 (p < 0.05), whereas Fu and Li’s D and F were −7.152 and −5.519 (both p-values < 0.02), respectively.
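The reported dN/dS ratio follows directly from the two rates, and the sign of Tajima’s D reflects whether the mean pairwise difference falls below the value expected from the number of segregating sites. The sketch below checks the ratio and implements the standard form of Tajima’s statistic (Tajima 1989). It is illustrative only: the function names are hypothetical, and no attempt is made to reproduce the reported D, because the number of segregating sites is not given in the text.

```python
from math import sqrt


def harmonic_sum(n, power=1):
    """Sum of 1 / i**power for i = 1 .. n-1 (a1 when power=1, a2 when power=2)."""
    return sum(1.0 / i ** power for i in range(1, n))


def tajimas_d(n, S, k):
    """Tajima's D from sample size n, segregating sites S, and mean pairwise differences k."""
    a1 = harmonic_sum(n)
    a2 = harmonic_sum(n, power=2)
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (k - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))


def dn_ds_ratio(dn, ds):
    """Ratio of non-synonymous to synonymous substitution rates."""
    return dn / ds


# Rates reported in the text: dN = 0.015, dS = 0.037.
print(round(dn_ds_ratio(0.015, 0.037), 3))  # → 0.405
```

A ratio below 1 means that synonymous changes outnumber non-synonymous ones per site, and a negative D means an excess of rare variants relative to neutral expectation; both patterns are what the text interprets as purifying selection with population expansion.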
Further amino acid analysis showed totals of 28 and 35 polymorphic sites in the N-terminal and C-terminal non-repeat regions, respectively (Additional file 2). For the N-terminal non-repeat region, there were 3 dimorphic changes [L18(F,P); S29(T,P); V34(I,L)] and 25 monomorphic changes. For the C-terminal non-repeat region, there were 6 dimorphic changes [H288(Q,R); T296(I,A); A315(G,Q); N319(K,R); D324(N,E); V332(A,L)] and 29 monomorphic changes. The csp amino acid sequences were categorized into 112 different haplotypes. In the network analysis, high frequencies of haplotype sharing were observed for P. knowlesi isolates between Malaysian Borneo and Peninsular Malaysia (i.e. H_6, H_8, H_12, H_13, and H_63), as well as between humans and macaques (e.g. H_1 and H_5) (Fig. 3).
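Haplotype diversity and haplotype sharing can be sketched in a few lines. The example below treats each distinct sequence string as one haplotype and uses the usual unbiased estimator Hd = n/(n − 1) × (1 − Σp²). It does not build a median-joining network, which is what the NETWORK software does; it only shows which haplotypes two groups have in common. Function names and the toy sequences are hypothetical.

```python
from collections import Counter


def haplotype_diversity(seqs):
    """Hd = n / (n - 1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    counts = Counter(seqs)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def shared_haplotypes(group_a, group_b):
    """Haplotypes (identical sequences) found in both groups."""
    return set(group_a) & set(group_b)


# Toy data for illustration only (not real csp sequences).
human = ["AA", "AA", "AT", "TT"]
macaque = ["AA", "GT"]
print(haplotype_diversity(human))
print(shared_haplotypes(human, macaque))
```

Values of Hd close to 1, such as the 0.9822 reported above, indicate that two sequences drawn at random from the sample almost always belong to different haplotypes.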
The csp protein densely coats the surface of Plasmodium sporozoites and has been reported to play an important role in sporozoite development and mammalian hepatocyte invasion [13, 14]. The central repeat region of csp is located between the N-terminal and C-terminal non-repeat regions of the gene, and more than 40 different repeat units have been found in it, in various arrangements and lengths. This study focuses solely on the non-repeat regions of the csp gene because the highly polymorphic central repeat region may introduce bias and lead to misinterpretation in the phylogenetic and genetic diversity analyses.
No distinct geographical separation was observed in the phylogenetic tree. This suggests that, based on the csp non-repeat regions, P. knowlesi isolates are similar across different regions, including Malaysian Borneo and Peninsular Malaysia, as well as between humans and macaques. The findings are comparable to previous data [29,30,31]. Plasmodium knowlesi has been hypothesized to have migrated from the mainland of Southeast Asia to Malaysian Borneo during the Pleistocene era through migrating wild macaques when the two regions were still connected [32, 33]. The present study further supports this hypothesis, as the P. knowlesi isolates from these regions were phylogenetically indistinguishable. Similar to previous studies [5, 17, 29, 30], P. knowlesi was found to be genetically closer to P. coatneyi, indicating the reliability of the phylogenetic tree constructed using the csp non-repeat regions.
The average number of pairwise nucleotide differences (K), nucleotide diversity (π) and haplotype diversity (Hd) for the combined non-repeat regions in this study were slightly higher than those reported by Fong et al. The sliding window plot with a length of 100 bp and step size of 25 bp showed that the most diverse region lay within the C-terminal non-repeat region, which is consistent with the high amino acid variation in this region. The RTS,S malaria vaccine, which is in Phase III trials, targets the T-cell epitopes in the C-terminal region, and recent studies reported modest results against P. falciparum [34, 35]. However, few such studies have been conducted for P. knowlesi. Therefore, the variation data of the C-terminal non-repeat region in this study could be beneficial for vaccine development against P. knowlesi.
The dN/dS ratio revealed that the P. knowlesi isolates underwent purifying selection. Furthermore, the significant negative values of Tajima’s D as well as Fu and Li’s D and F further suggested purifying selection with population expansion of the zoonotic P. knowlesi, as evidenced by the abundance of csp haplotype sharing between humans and macaques. One possible explanation is the rapid deforestation of large areas in Malaysia, which has caused macaques and Anopheles vectors to move closer to human habitats. This could be promoting the spread of P. knowlesi parasites.
The close interaction between these three groups is also represented by the abundance of P. knowlesi infections in humans, which have been reported since 2010 [6,7,8,9,10,11]. Hence, better regulations for deforestation are urgently required in Malaysia, especially for forests inhabited by M. fascicularis (long-tailed macaque) and M. nemestrina (pig-tailed macaque). Such efforts would prevent further expansion of this simian parasite in human hosts.
In summary, this study has suggested that there is no distinct cluster of P. knowlesi isolates from different geographic regions including Malaysian Borneo and Peninsular Malaysia. The csp non-repeat regions of the zoonotic P. knowlesi isolates were found to undergo purifying selection with population expansion, which was further supported by the extensive haplotype sharing observed between humans and macaques. Nonetheless, novel mutations were observed in the C-terminal non-repeat regions of the csp gene, which could be beneficial for vaccine development to treat P. knowlesi parasites.
Availability of data and materials
The datasets analyzed in this study are available from the corresponding author on request.
Abbreviations
- PCR : Polymerase chain reaction
- csp : Circumsporozoite protein
- K : Average number of pairwise nucleotide differences
- h : Number of haplotypes
- dS : Rate of synonymous mutation
- dN : Rate of non-synonymous mutation
World Health Organization. World Malaria Report 2019. Geneva: World Health Organization; 2019.
World Health Organization. Global Technical Strategy for Malaria 2016–2030. Geneva: World Health Organization; 2015.
Singh B, Lee KS, Matusop A, Radhakrishnan A, Shamsul SSG, Cox-Singh J, et al. A large focus of naturally acquired Plasmodium knowlesi infections in human beings. Lancet. 2004;363:1017–24.
Lee KS, Cox-Singh J, Singh B. Morphological features and differential counts of Plasmodium knowlesi parasites in naturally acquired human infections. Malar J. 2009;8:73.
Vythilingam I, Noorazian YM, Huat TC, Jiram AI, Yusri YM, Azahari AH, et al. Plasmodium knowlesi in humans, macaques and mosquitoes in Peninsular Malaysia. Parasit Vectors. 2008;1:26.
Lee CE, Adeeba K, Freigang G. Human Plasmodium knowlesi infections in Klang Valley Peninsular Malaysia: a case series. Med J Malaysia. 2010;65:63–5.
Lau TY, Joveen-Neoh WF, Chong KL. High incidence of Plasmodium knowlesi infection in the interior division of Sabah, Malaysian Borneo. Int J Biosci Biochem Bioinform. 2011;1:163–7.
Goh XT, Lim YAL, Vythilingam I, Chew CH, Lee PC, Ngui R, et al. Increased detection of Plasmodium knowlesi in Sandakan division, Sabah as revealed by PlasmoNex™. Malar J. 2013;12:264.
William T, Rahman HA, Jelip J, Ibrahim MY, Menon J, Grigg MJ, et al. Increasing incidence of Plasmodium knowlesi malaria following control of P. falciparum and P. vivax malaria in Sabah Malaysia. PLoS Negl Trop Dis. 2013;7:e2026.
William T, Jelip J, Menon J, Anderios F, Mohammad R, Awang Mohammad TA, et al. Changing epidemiology of malaria in Sabah, Malaysia: increasing incidence of Plasmodium knowlesi. Malar J. 2014;13:390.
Lee PC, Chong ETJ, Anderios F, Lim YAL, Chew CH, Chua KH. Molecular detection of human Plasmodium species in Sabah using PlasmoNex™ multiplex PCR and hydrolysis probes real-time PCR. Malar J. 2015;14:28.
Lee KS, Vythilingam I. Plasmodium knowlesi: emergent human malaria in Southeast Asia. In: Lim YAL, Vythilingam I, editors. Parasites and their vectors: a special focus on Southeast Asia. Vienna: Springer; 2014.
Ménard R, Sultan AA, Cortes C, Altszuler R, van Dijk MR, Janse CJ, et al. Circumsporozoite protein is required for development of malaria sporozoites in mosquitoes. Nature. 1997;385:336–40.
Coppi A, Natarajan R, Pradel G, Bennett BL, James ER, Roggero MA, et al. The malaria circumsporozoite protein has two functional domains, each with distinct roles as sporozoites journey from mosquito to mammalian host. J Exp Med. 2011;208:341–56.
de Camargo TM, de Freitas EO, Gimenez AM, Lima LC, de Almeida CK, Francoso KS, et al. Prime-boost vaccination with recombinant protein and adenovirus-vector expressing Plasmodium vivax circumsporozoite protein (CSP) partially protects mice against Pb/Pv sporozoite challenge. Sci Rep. 2018;8:1118.
McCutchan TF, Kissinger JC, Touray MG, Rogers MJ, Li J, Sullivan M, et al. Comparison of circumsporozoite proteins from avian and mammalian malarias: biological and phylogenetic implications. Proc Natl Acad Sci USA. 1996;93:11889–94.
Vargas-Serrato E, Corredor V, Galinski MR. Phylogenetic analysis of CSP and MSP-9 gene sequences demonstrates the close relationship of Plasmodium coatneyi to Plasmodium knowlesi. Infect Genet Evol. 2003;3:67–73.
Fong MY, Ahmed MA, Wong SS, Lau YL, Sitam F. Genetic diversity and natural selection of the Plasmodium knowlesi circumsporozoite protein nonrepeat regions. PLoS ONE. 2015;10:e0137734.
Chew CH, Lim YA, Lee PC, Mahmud R, Chua KH. Hexaplex PCR detection system for identification of five human Plasmodium species with an internal control. J Clin Microbiol. 2012;50:4012–9.
Lim YAL, Rohela M, Chew CH, Thiruventhiran T, Chua KH. Plasmodium ovale infection in Malaysia: first imported case. Malar J. 2010;9:272.
Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. MEGA6: molecular evolutionary genetics analysis version 6.0. Mol Biol Evol. 2013;30:2725–9.
Librado P, Rozas J. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics. 2009;25:1451–2.
Nei M, Gojobori T. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol Biol Evol. 1986;3:418–26.
Tajima F. Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics. 1989;123:585–95.
Fu YX, Li WH. Statistical tests of neutrality of mutations. Genetics. 1993;133:693–709.
Bandelt HJ, Forster P, Röhl A. Median-joining networks for inferring intraspecific phylogenies. Mol Biol Evol. 1999;16:37–48.
Ozaki LS, Svec P, Nussenzweig RS, Nussenzweig V, Godson GN. Structure of the Plasmodium knowlesi gene coding for the circumsporozoite protein. Cell. 1983;34:815–22.
Lee KS, Divis PCS, Zakaria SK, Matusop A, Julin RA, Conway DJ, et al. Plasmodium knowlesi: reservoir hosts and tracking the emergence in humans and macaques. PLoS Pathog. 2011;7:e1002015.
Wong JPS, Tan CH, Lee V, Li IMZ, Lee KS, Lee PJ, et al. Molecular epidemiological investigation of Plasmodium knowlesi in humans and macaques in Singapore. Vector Borne Zoonotic Dis. 2011;11:131–5.
Sermwittayawong N, Singh B, Nishibuchi M, Sawangjaroen N, Vuddhakul V. Human Plasmodium knowlesi infection in Ranong province, southwestern border of Thailand. Malar J. 2012;11:36.
Loh JP, Gao QHC, Lee VJ, Tetteh K, Drakeley C. Utility of COX1 phylogenetics to differentiate between locally acquired and imported Plasmodium knowlesi infections in Singapore. Singapore Med J. 2016;57:686–9.
Voris HK. Maps of Pleistocene sea levels in Southeast Asia: shorelines, river system and time durations. J Biogeogr. 2000;27:1153–67.
Singh B, Daneshvar C. Human infections and detection of Plasmodium knowlesi. Clin Microbiol Rev. 2013;26:165–84.
Ballou WR. The development of the RTS,S malaria vaccine candidate: challenges and lessons. Parasite Immunol. 2009;31:492–500.
Campo JJ, Sacarlal J, Aponte JJ, Aide P, Nhabomba AJ, Dobaño C, et al. Duration of vaccine efficacy against malaria: 5th year of follow-up in children vaccinated with RTS,S/AS02 in Mozambique. Vaccine. 2014;32:2209–16.
Vythilingam I. Plasmodium knowlesi in humans: a review on the role of its vectors in Malaysia. Trop Biomed. 2010;27:1–12.
This work was supported by the Ministry of Higher Education, Malaysia (FRGS0322-SG-1/2013).
Ethics approval and consent to participate
Informed verbal consent was obtained from the patients and ethical approval for this study was obtained from the Ethics Committee of University of Malaya Medical Centre (reference no. 709.2).
Consent for publication
Competing interests
The authors declare that they have no competing interests.
Additional file 1: Csp sequences retrieved from the GenBank database.
Additional file 2: Amino acid polymorphisms in the csp N-terminal and C-terminal coding sequences.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
Cite this article
Chong, E.T.J., Neoh, J.W.F., Lau, T.Y. et al. Genetic diversity of circumsporozoite protein in Plasmodium knowlesi isolates from Malaysian Borneo and Peninsular Malaysia. Malar J 19, 377 (2020). https://doi.org/10.1186/s12936-020-03451-x
Keywords
- Plasmodium knowlesi
- Circumsporozoite protein
- Genetic diversity
- Malaysian Borneo