Functional analysis of Plasmodium falciparum subpopulations associated with artemisinin resistance in Cambodia
- Ankit Dwivedi1, 2, 13Email author,
- Christelle Reynes3, 4,
- Axel Kuehn2,
- Daniel B. Roche1, 5,
- Nimol Khim6,
- Maxim Hebrard1, 7, 8,
- Sylvain Milanesi1,
- Eric Rivals1, 7,
- Roger Frutos9, 10,
- Didier Menard6, 11,
- Choukri Ben Mamoun12,
- Jacques Colinge2 and
- Emmanuel Cornillot1, 2Email authorView ORCID ID profile
© The Author(s) 2017
Received: 11 July 2017
Accepted: 12 December 2017
Published: 19 December 2017
Plasmodium falciparum malaria is one of the most widespread parasitic infections in humans and remains a leading global health concern. Malaria elimination efforts are threatened by the emergence and spread of resistance to artemisinin-based combination therapy, the first-line treatment of malaria. Promising molecular markers and pathways associated with artemisinin drug resistance have been identified, but the underlying molecular mechanisms of resistance remains unknown. The genomic data from early period of emergence of artemisinin resistance (2008–2011) was evaluated, with aim to define k13 associated genetic background in Cambodia, the country identified as epicentre of anti-malarial drug resistance, through characterization of 167 parasite isolates using a panel of 21,257 SNPs.
Eight subpopulations were identified suggesting a process of acquisition of artemisinin resistance consistent with an emergence-selection-diffusion model, supported by the shifting balance theory. Identification of population specific mutations facilitated the characterization of a core set of 57 background genes associated with artemisinin resistance and associated pathways. The analysis indicates that the background of artemisinin resistance was not acquired after drug pressure, rather is the result of fixation followed by selection on the daughter subpopulations derived from the ancestral population.
Functional analysis of artemisinin resistance subpopulations illustrates the strong interplay between ubiquitination and cell division or differentiation in artemisinin resistant parasites. The relationship of these pathways with the P. falciparum resistant subpopulation and presence of drug resistance markers in addition to k13, highlights the major role of admixed parasite population in the diffusion of artemisinin resistant background. The diffusion of resistant genes in the Cambodian admixed population after selection resulted from mating of gametocytes of sensitive and resistant parasite populations.
Malaria is one of the widespread parasitic infections affecting humans and World Health Organization malaria elimination programmes are threatened by the emergence of drug resistance. Cambodia is considered to be the epicentre of Plasmodium falciparum resistance to the first-line treatment, artemisinin-based combination therapy . The emergence and existence of artemisinin resistant parasites is now reported at different locations in Greater Mekong Subregion (GMS) [2–10]. Understanding the mechanism underlying resistance is crucial for the identification of new drug targets and restriction of spread of artemisinin resistance to high malaria transmission areas such as Africa.
The fragmentation of Cambodian P. falciparum population into artemisinin resistant (founder subpopulations), sensitive (ancestral population) and admixed subpopulations was shown in recent reports [11–13]. Furthermore, four non-synonymous mutations (C580Y, Y493H, R539T and I543T) in the propeller domain of the Kelch gene k13 (PF3D7_1343700), located on chromosome 13, were described as determinants of clinical artemisinin resistance in Cambodia . Nine other non-synonymous mutations in the k13 were associated with clinical artemisinin resistance in GMS [2–10]. The k13 mutant alleles exhibiting clinical artemisinin resistance have not been observed outside the GMS . Some of the k13 artemisinin resistant alleles have emerged independently with different geographical origins in different locations in SEA [4, 8] and have spread to neighboring countries . This could be due to different selective pressure at different locations . The most dominant and transmissible allele reaching fixation is observed to be C580Y in Cambodia, Thailand–Myanmar border, Vietnam and southern Laos, with a single origin in western Cambodia [8–10, 15, 16]. There could be certain essential genetic background mutations facilitating the artemisinin drug resistance through selection, a similar phenomenon observed with the resistance to chloroquine, pyrimethamine or sulfadoxine in the 1960s in the GMS [1, 17]. Emergence was also suspected to be related to the small and well-structured parasite population in the GMS  The relative low transmission preventing the development of protective immunity in the human population may have helped emergence of drug resistance, especially at the border between Cambodia and Thailand . The other reason for emergence of resistance could be the early introduction of anti-malaria drug based treatments, with inappropriate dose or usage that may have led to continual drug exposure of malarial parasite populations [20, 21].
Based on genome wide analysis, four genes Pffd, Pfcrt, Pfarps10 and Pfmdr2, have been described as background genes supporting k13 resistant alleles . Although, these markers certainly determine artemisinin resistance, the underlying resistance mechanism is still unknown. Mbengue and colleagues have described a K13-PI3K pathway to be potentially associated with artemisinin resistance, and that PI3K could be targeted for understanding resistance mechanism . Also, biological process and pathways like ubiquitination, oxidative stress response and unfolded protein response pathways, are associated with artemisinin resistance [23, 24]. The present study aims to associate the biological processes and metabolic pathways with the early P. falciparum population structure in Cambodia, by analysing mutations in parasites isolated in Cambodia during the early period of emergence of artemisinin resistance (2008–2011). An integration of population structure, genetic variations, networks and annotation data was performed using a theory described here as emergence-selection-diffusion (ESD) model. Bioinformatic results on significant mutations and related biological pathways were used to test the model and evaluate the process of emergence and selection in specific small subpopulations, followed by diffusion in an admixed population. This model defined for a descriptive approach is similar to the population shifting balance and metapopulation theories in population genetics that were successfully implemented for the analysis of P. falciparum populations [25, 26].
In total, 21,257 non-synonymous SNPs were identified using specific filters among 167 isolates from the genomic dataset on which the parasite population structure was described earlier . Based on hierarchical clustering and network-based stratification , eight specific subpopulations were identified among four regions in Cambodia. An extended list of 57 background genes associated with artemisinin resistant parasites in Cambodia is described. In addition to the identification of known targets, functional analysis reveals a strong interplay between ubiquitination and cell division in artemisinin resistant parasites.
The genome sequencing data of Cambodian parasite isolates was recovered from the ENA (European Nucleotide Archive) database server, submitted by the Welcome Trust Sanger Institute. Out of the 293 P. falciparum whole genome sequences from 2008 to 2011, which were used to define the P. falciparum parasite population structure in Cambodia [11, 12], 167 sequences were recovered successfully (Additional file 1) in BAM format and converted to much readable VCF (v4.1) files  using SAMtools v0.1.19 . Plasmodium falciparum 3D7 strain (genome sequence annotation version 2) was used as the reference genome, and was recovered from the PlasmoDB Plasmodium Genomics resource database (release version 5.5) . The 167 genome sequences originate from four locations: Pailin (14 sequences), Tasanh (26 sequences) and Pursat (81 sequences) in western Cambodia and Ratanakiri (46 sequences) in eastern Cambodia.
Filtering data for noise
A reliable variant calling pipeline was established for the identification of significant SNPs (Additional file 2). The analysis was focused on non-synonymous SNPs in the coding region of the genome, which occur in at least one of the isolates, and does not take insertion/deletion (INDELs) into consideration. The SNP data was filtered for noise based on the VCF (v4.1) file signal parameters. For identification of thresholds on these parameters to filter the data, at first, around 100 kb were removed from starting and end of each chromosome. These 100 kb regions at the extremity of P. falciparum chromosomes encode for multi-gene families, most of them encoding putative surface antigens (including VAR, Rifin, Stevor). An average value for all the signal parameters over 167 isolates were calculated for each SNP position as sum of the values in different isolates over the number of isolates having a non-reference allele. The average quality values per position were plotted along the genome, and a clear threshold in the mapping quality (MQ) parameter was observed at value 29 (Additional file 3). Removing chromosome ends removed most of the signal below the score 29. Therefore, mapping quality score higher than 29 was considered as one of the filtering criteria. Many SNPs showed MQ ≤ 29 in the coding core of chromosome 4 and 7, as internal VAR gene clusters are present in these chromosomes. No other quality parameters plotted along the genome, with or without the chromosome end regions showed any change in distribution of SNPs, on which a significant threshold could be defined. In order to select the SNPs with high quality non-reference (non-REF) signal, DP4, the parameter accounting for high-quality forward strand REF bases, reverse strand REF bases, forward strand non-REF bases and reverse strand non-REF bases for a specific position was analysed. To account for the percentage of non-REF signal (referred to as DA in the manuscript) and choose a threshold, the distribution of proportion of non-REF alleles (forward strand non-REF bases + reverse strand non-REF bases) over the sum of all alleles was analysed. The value 0.7 was chosen, as it seemed to be the intersection of two distributions: low quality SNPs on the left and high quality SNPs on the right (Additional file 4). Therefore, DA score ≥ 0.7 was considered as the second filtering criteria to include SNPs with high quality non-REF allele calls. Around 20,000 SNPs were recovered for each isolate after implementing these two filters (min: 13,470, max: 23,022 SNPs). There were 247,783 SNPs having a non-REF allele in at least one of the 167 Cambodian parasite genomes.
In order to remove the SNPs with rare allele frequencies in the population, minor allele frequency (MAF) for each SNP was analysed (Additional file 5). It is defined as the minimum of non-reference allele frequency (NRAF) and 1-NRAF [11, 31]. All the SNPs with a minor allele frequency (MAF) less than 0.01796 were removed from the analysis (Additional file 5). After applying the three filters, 111,701 unique SNPs were recovered in 167 Cambodian isolates.
Correspondence between different genome versions
The correspondence of the SNP coordinates between P. falciparum genome sequence version 2 and 3 was recovered using BLAST (Basic Local Alignment Search Tool) from NCBI (National Centre for Biotechnology Information), for each chromosome separately. For the chromosomes 4, 7, 8, 10 and 13 very short lengths of alignments were obtained depicting major changes in the genomic sequence between version 2 and 3. Correspondence for these chromosomes was then obtained by defining specific regions for BLAST. Correspondence for approximately all SNPs in all chromosomes was recovered, and the 3105 unmapped SNPs in the recovered dataset were removed from the analysis. NCBI BLAST was performed for each chromosome, with genome version 2 (PlasmoDB release version 5.5) as reference and version 3 (PlasmoDB release version 10) as query, to recover a list of 108,596 SNPs.
Removing uncertain SNPs and correcting errors
After filtering the data and removing the unmapped SNPs, the uncertainties were treated. SNPs with more than one ALT (non-REF) allele for a specific isolate were considered as uncertainties. SNPs having the uncertain ALT allele frequency higher than 40% were removed (830 SNPs). For the cases where SNPs had only one ALT allele in most of the isolates and uncertain ALT allele in some isolates, the uncertain allele was substituted with the ALT value (16,859 SNPs). For the other cases where SNPs had more than one ALT allele for different isolates, the uncertain ALT alleles were substituted with the most frequent ALT allele (1772 SNPs). The ALT alleles with the frequency 1.5 times the frequency of same allele at random, are considered as the most frequent ALT allele for SNP. In the case of uncertain ALT allele frequency less than 5% and no majority ALT allele, the uncertain ALT allele was substituted with the REF allele (54 SNPs). All the other cases were removed from the analyses (1228 SNPs). A total of 106,538 unique SNPs were recovered in 167 Cambodian isolates with 18,683 modified SNPs (Additional file 6).
Annotation of the recovered SNPs
Chromosome end coordinates
Positions excluded from start of chromosome
Positions excluded from end of chromosome
Chromosome length (bp)
The final set of 21,257 non-synonymous SNPs in the coding region was described, using the recovered annotation and chromosome ends region coordinates (Additional file 7). This dataset is referred to as IBC dataset in the manuscript and is used for parasite population study in Cambodia. There are 3714 modified SNPs (as described in the section above) in the set of these 21,257 SNPs.
To describe the P. falciparum population structure in Cambodia, unsupervised hierarchical clustering was performed on the IBC dataset (all the statistical analysis is performed in R v3.0.1 and v3.2.3). The pairwise distance between two isolates was estimated as the proportion of base substitution between them over the whole set of recovered SNPs. Ward minimum variance method was used as a metric to build the dendrogram. The correspondence between previously described parasite subpopulations in Cambodia  and the 167 isolates were recovered from the Sanger Institute. Eight subpopulations were described based on the hierarchal clustering results: KH1.1, KH1.2, KH2.1, KH2.2, KH3, KH4, KH5 and KHA. In order to choose the optimal number of clusters in the dendrogram, the value of k was set to 2 to 10 and the clusters obtained at k = 8 overlapped both, clusters based on different k13 alleles and the previously described KH subpopulations. By further increasing the k, only the admixed subpopulation KHA was further divided into small subpopulations.
Significant SNPs and genes
Number of significant SNPs and genes in each subpopulation compared to ancestral artemisinin sensitive subpopulation
Number of isolates
Number of significant SNPs
Number of significant genes
Gene interaction networks and gene ontology
In order to determine the biological processes associated with different subpopulations, gene–gene interaction networks were analysed. The interaction network data was recovered from STRING v10, which provides functional and predicted protein–protein interactions from other publicly available data sources and literature . Networks based on co-expression data for all the subpopulations based on significant genes were recovered. Only edges with a STRING confidence score for co-expression higher than 0.5 were considered for the analysis. The networks were imported and analysed in Cytoscape v3.2.1 .
Number of genes in P. falciparum gene–gene interaction network with maximum expression in different parasite blood stage forms
Number of genes
For identification of biological function associated to the significant genes, the functionally grouped networks of GO terms and pathways were recovered and analysed for each subpopulation using the ClueGO v2.2.4  and CluePedia v1.2.4  plugins of Cytoscape. The network is created with nodes as the term and edges as the association based on kappa score. All the default conditions of the ClueGO plugin were used. Right-sided hypergeometric test (enrichment) was used as the statistical test and the Benjamini–Hochberg method was used to correct p values.
Network based subpopulation description
The population structure of 167 parasite isolates was also questioned using the method of Network Based Stratification  which clusters the isolates together having diffusion paths associated to mutated genes in similar network regions. The list of mutated genes for each isolate was projected on the full P. falciparum interaction network recovered from STRING v10 . All prediction sources such as co-expression, co-occurrence, gene fusion, databases, experimental evidence, text-mining and neighbourhood were considered for the full P. falciparum interaction network. The results were obtained for the top 10% gene–gene interactions based on combined confidence score provided by STRING database . The mutated genes were propagated to the neighbourhood network (network smoothing) and based on the node score matrix of the resulting diffusion network for each isolate, clustering was performed using non-negative matrix factorization (NMF) method . Consensus clustering was performed by selecting 80% mutated genes and 80% isolates 100 times randomly and iterating NMF clustering 10 times. The consensus clustering between samples was estimated as the percentage of co-clustering results in which they are in the same cluster when the dendrogram is cut to obtain 8 groups (to make the comparison easy with the 8 KH sub-populations). This consensus clustering matrix was normalized and then used to build a dendrogram for all 167 isolates using Euclidean distance matrix and ward minimum variance method in R.
Some of the isolates were grouped in different clusters, compared to the clustering results based on 21,257 SNPs. There were 10 isolates (isolate index: 9, 70, 91, 103, 112, 134, 148, 151, 160, 162) clustering in KHA (based on 21,257 SNPs), but according to clustering based on network based stratification these isolates clustered in other resistant and sensitive subpopulations. Also one isolate (isolate index: 156) was classified in KH2.2 which was in KH2.1 earlier and one isolate (isolate index: 61) is classified in KH2.1 which was in KH3 earlier. All the other 155 isolates completely overlaps with the clusters observed previously.
Ribosome S10 protein structure prediction
Models were built using four independent structure prediction servers, IntFOLD , Phyre2 , RaptorX  and LOMETS . For each model, the top model from each of the four servers was analysed using ModFOLDclust2  and manually inspected, showing all models had the same fold. The RaptorX models were selected as the best representative models according the ModFOLDclust2 score and manual inspection. Structural superposition of the RaptorX models was undertaken using the TM-align method , which produces a TM-score between 0 and 1, with scores above 0.5 indicating the same fold and scores close to 1 indicating a high degree of structural similarity of the two proteins.
Description of P. falciparum subpopulations
The parasite population structure in Cambodia was also confirmed using a stratification method  based on co-expression network data recovered from STRING database . Only some admixed subpopulation KHA samples were differentially classified, suggesting potential close relationship with the donor subpopulations in terms of gene expression (Additional file 9). Consequently, out of the 49 isolates in this subpopulation, 10 isolates were clustering with other subpopulations (Fig. 1b). The clustering of 6 mutant k13 and 4 wildtype k13 isolates with the ART-R donor subpopulations was intriguing regarding the diffusion of k13 alleles after mating between gametocytes. For example, no ART-R isolates were found with a KH1 background of gene expression. Furthermore, clustering shows that the KH1.2 ART-S subpopulation is robust and gene expression network might be closer to ART-R subpopulations than ART-S subpopulation KH1.1, suggesting the presence of P. falciparum parasite subpopulations with specific genetic backgrounds in Cambodia before the introduction of ART.
Subpopulation associated genetic background
The whole 265 significant genes displayed GO term association with cell signaling, gene expression, post-translational regulation, RNA metabolism (tRNA and mRNA modification…) and organelle related functions (Additional file 12). The 168 genes set GO annotation revealed conserved enrichment for cell signaling, gene expression and organelle functions (Additional file 13). About 60/265 genes encode membrane proteins encompassing 19 surface antigens including msp1 and some specific var genes (Additional file 14). Several genes are associated with cell signaling and a significant number of these are protein kinases (7/17). The Pfmdr2 gene is part of the 168 genes set and is associated with the background D484I mutation, which was present in KH1.2 ART-S subpopulation.
The ARPS10 (apicoplast ribosomal protein S10) encoding gene was found to carry two different mutations V127 M and D128H, which were significantly enriched in ART-R and admixed subpopulations, whereas KH1.2 had only D128H mutation. The V127 M mutation was previously described in Neisseria gonorrhea to be associated with tetracycline resistance . Protein model prediction revealed that ARPS10 central domain is highly similar to the bacterial protein, and that the mutated valine is at the same place in a loop (Additional file 15). Plasmodium falciparum doxycycline resistance was not characterized so far in SEA, but background mutations may contribute to lower efficiency of antibiotic therapy that remains to be evaluated.
Description of ART-R subpopulations genetic background
To assess the ART-R subpopulations background genes, the 168 genes set found in common with KH1.2 ART-S subpopulation were removed from the total of 265 genes, and the remaining genes were analysed (Fig. 3b; Additional file 10). The 97 genes were distributed in the Ring and Schizont categories of the co-expression network (Fig. 3c). An enrichment for the GO term “anion binding” is the only one which is reminiscent of cell signaling pathway in the remaining set of 97 genes (Additional file 16). Indeed, protein kinase activity is now restricted to the presence of ARK2 (Aurora Related Kinase 2), a potential drug target , and the CCAAT-box DNA binding protein is the only transcription factor remaining at this step. Analysis also brings out the association of the 97 genes set with specific mutations in the PDE1 phosphodiesterase, which is involved in the regulation of cGMP concentration. The pde1 gene carries 3 mutations found in some individuals of the KH1.1 ancestral population, but only S506L was fixed in all the ART-R subpopulations (Additional file 17). Specificity of PDE1 for cGMP was characterized experimentally  and the role of the cyclic nucleotides in parasite division and differentiation was well established . Genes in the same GO term functional group as k13 are associated with poly-ubiquitination [MAL7P1.19 Ubiquitin transferase (UT) and rpn10 proteasome subunit], signal transduction (pi3k), autophagy (atg7 and atg18) and ferredoxin (Additional file 16). Background mutations associated with the ferredoxin gene (fd) were found to be significant in all ART-R and admixed subpopulations. Interestingly the ferredoxin D193Y background mutation was not always associated with the k13 mutations in the KH3 subpopulation. Ferredoxin gene is also part of a functional group of genes encoding apicoplast proteins of unknown function. The PfCRT encoding gene is also part of ART-R subpopulations background and was associated with 13 mutations including the two background mutations N326S and I356T. KH1.2 had no significant Pfcrt mutations. An additional R371I mutation was found in ART-R and the admixed subpopulation KHA. KH4 isolates carry one more significant mutation T256I.
The background genes of artemisinin resistance associated pathways
Parasite population characterization suggests that emergence of k13 mutations in Cambodia was tightly related to the presence of subpopulations like KH2, KH3, KH4 and KH5, which provides an appropriate background for acquisition of artemisinin resistance. In the admixed subpopulation KHA, 22 isolates out of the 49 KHA isolates had no k13 allele. This high proportion may be related to the focus of the present study on the early period of emergence of artemisinin resistance (2008–2011). Out of the 5 background genes mentioned above, the fd gene was the only significant gene in the KHA subpopulation when compared to the total population of 167 isolates. When KHA was compared with the ancestral population KH1.1 as reference, the number of significant genes increased to 467. These KHA specific gene list overlaps with that of ART-R 97 genes set. The presence of overlapping functions and genes (Fig. 3d) supports the idea that KHA genetic background in the period 2008–2011 was acquired through crossing of ART-R parasites with various populations either ancestral, resistant or a pre-existing admixed population as suggested by the comparison of the stratification approaches used in this study (Fig. 1; Additional file 9). The artemisinin resistance genetic background was defined as the intersection between ART-R 97 subpopulation background genes and the 467 KHA significant genes (Fig. 3b; Additional file 18), and a final list of 57 genes was recovered (Additional file 10).
The 57 genes set is supported by 213/230 mutations that were significant in KHA and at least one of the resistant subpopulation (Additional file 7). Diffusion of the ART-R subpopulations genetic background is also demonstrated by the fact that parasites having one of the k13 resistant allele present the higher proportion of these mutations (Additional file 19). Genes had a median of 3 mutations with a maximum of 27 mutations in PFL0940c of the var genes family (Additional file 20). High number of mutations in some samples from specific subpopulation was in agreement with the observation that some KHA isolates show different classification between hierarchical clustering and network based stratification (Fig. 1; Additional file 9).
Interplay between ubiquitination and cell signalization pathways
Relationship between protein turn-over at the proteasome and artemisinin resistance is supported by missense mutations in K13. K13 is expected to work as part of an E3-like protein complex interacting with the target proteins through the Kelch domain. Poly-ubiquitination of PI3K kinase mediated by K13 was described experimentally , but the E3 ubiquitin ligase containing K13 may also target many more proteins. Ubiquitin binds to lysine, cysteine, serine or threonine residues or at the N-terminus on the protein substrate. The final 57 genes set presents 95 missense mutations concerning such residues over the 230 mutations that are significant in at least one subpopulation. More information will be needed to identify those who are indeed targeted by the K13-associated E3-complex. The S102C mutation found in the P. falciparum derlin protein (DER1-1) is intriguing, as it is the only significant mutation observed in this gene. Derlin1 is involved in polyubiquitination of ER-located proteins and was also found to be located at the apicoplast in P. falciparum. The K523 N mutation is observed in ARK2 (Aurora Related Kinase 2), which was fixed in the KHA subpopulation. ARK2 is a serine/threonine protein kinase and is essential for asexual parasite development . Mutations in Pfark2 are all clustered in two regions of about 200 amino acids in the N-terminus part of the protein (Fig. 5). Most of these mutations globally increase the level of hydrophobicity of the ARK2 N-terminus domain of the protein suggesting that these two regions are probably part of a protein interface.
ATG18, DER1-1 and PDE1 are the three proteins out of the 12 with a single mutation from the 57 genes set, where a serine or a threonine is lost by mutation. The S506 PDE1 residue could modulate phosphodiesterase activity regulating cGMP concentration and activity of the single cGMP dependent protein kinase PKG found in P. falciparum (PF14_0346). Additionally, no connections were found with transcription factors. The two ApiAP2 protein which were in 265 genes set were absent in the 97 genes set. The SET4 and CAAT-binding proteins were not conserved in the final 57 genes set. Other approaches must be considered to look for possible inference of the mammals KEAP1/NRF2 model in the P. falciparum artemisinin resistance mechanism .
Fixation of mutations in the ART-R subpopulations
Genetic background of subpopulations suggests that resistance to artemisinin appears as a selection process among parasites from subpopulations. In the case of k13, different subpopulations carry different alleles: Y493H, R539T or C580Y. The gene is unique among all ART-R related genes because it is the only one for which mutually exclusive mutations are found. The k13 gene was found to be significant in donor populations where only one k13 mutation was present (p = 3.92E−11), but also in KH5 (p = 1.0296E−4) where all the 3 different exclusive mutations were present. KH5 is a donor for R539T mutation, as it is absent in other subpopulations except KHA. The fixation of k13 mutations was not so different from the other genes of the ART-R subpopulations genetic background. There are 524 mutations occurring in the 97 genes that were significant in at least one ART-R subpopulation (Additional file 7). None of the k13 mutations were found in the KH1.1 ancestral population, and this was also true for several mutations related to the 97 genes set (154/524). These 154 mutations suggest that ART-R subpopulations specific mutations that were not found in the KH1.1 samples were certainly rare in the ancestral population, including k13. The genetic drift during the emergence of subpopulations and the selection performed afterwards is also characterized by the presence of only 57/524 significant mutations that are found in all the ART-R subpopulations, but these mutations are located in 46/97 genes of the ART-R subpopulations genetic background confirming that selection process may have targeted specific pathways. Altogether, these results suggest that selection of the artemisinin genetic background happened in the daughter subpopulations and not in the ancestral population.
Role of KHA parasites in the diffusion of resistance to artemisinin
The process of emergence of artemisinin resistance could be summarized with emergence-selection-diffusion model (ESD). This model is similar to the population shifting balance theory described by Wright . The present work describes the emergence of common genetic background in different subpopulations. Parasites among these subpopulations presented a large number of fixed mutations compared to the ancestral population KH1.1. Genetic drift in small populations and relationship with resistance were described previously [4, 11–13]. Emergence of k13 resistant alleles was only observed in the genetic background of these subpopulations and alleles were clearly associated with specific subpopulations. This mechanism has now been confirmed worldwide .
Presence of ART-R subpopulation specific mutations associated with known drug resistance markers such as Pfcrt N326S and I356T background mutations or R371I found in Dd2 (in a context where K76T is present in 34/167 samples including KH1.1 samples), S436A and K540 N in Pfdhps and T484I background mutation in Pfmdr2 (in combination with a new G299D mutation) further support the idea that artemisinin resistant parasites were selected among subpopulations in a context where fixation of mutations could be speed up by drug pressure [4, 59–62]. In the same way, the results suggest that background mutations in Pfarps10 are possibly associated with doxycycline resistance even though there is little evidence of P. falciparum resistance to this antibiotic in Cambodia. Doxycycline resistance in P. falciparum can be explained in part by pfmdt and pftetQ copy number (PFE0825w and PFL1710c, respectively) and polymorphisms on pftetQ, which encode transporters . None of these two genes show any variation in the 167 isolates, but copy number was not evaluated. The fixation of the arps10 mutations may result from the use of doxycycline against P. falciparum in combination with quinine, or in anti-malarial prophylaxis . It should be considered that Pfarps10 background mutations are a side effect of the wide use of tetracycline derivatives in Cambodia, especially in the suspicion of leptospirosis epidemics . The Pfarps10 mutation contributes little to tetracycline resistance in Neisseria gonorrhoea alone, but was responsible for high resistance in combination with efflux pumps . It can be imagined that high resistant P. falciparum parasites can emerge by crossing between KHA samples with Africans isolates, where high copy numbers of transporters were described .
The present study considers an Emergence-Selection-Diffusion model (ESD) for description of biological information related to mutations accumulating in parasites during the early time period of emergence of artemisinin resistance (2008–2011). This descriptive study showed that random drift in subpopulations may help the emergence of mutations in specific biological pathways. The 168 genes set was associated with cell signaling, gene expression, organelle functions and surface antigens such as msp1 and some var genes. Analysis allow the detection of common features between resistant subpopulations that are result from the selection of drug resistant parasites. The 57 background genes found in the present study, associated with the selection of k13 mutants, are encoding proteins involved in ubiquitination, autophagy and cell signaling enzymes such as ARK2, PI3K and PDE1 which are involved in cell division and differentiation. Relationship with other malaria drug resistance markers was also present in the subpopulations. Possible resistance to tetracycline emphasize that the population structure in the GMS is promoting rapid emergence of drug resistance as hypothesized by the population shifting balance or metapopulation theories.
Diffusion of the k13 alleles related to the early time of emergence of artemisinin resistance in Cambodia have still not diffused outside of the country, at least up to 2016 when the last large survey was performed [5, 10]. Nevertheless, the present study emphasizes the important role of KHA parasites in diffusion of artemisinin resistance in Cambodia, which are more prone to crossing with other parasites according to the ESD model and similar population genetic theories. The fear of diffusion of k13 alleles outside GMS is real now, as some are fixed in KHA parasites, but this will need to be confirmed experimentally. The functional analysis of genomic data also suggests that biological functions associated with the 57 background genes could either be the result of environmental pressure or intrinsic consequence of genetic drift in parasites with low fitness. Several resistant markers to known drugs were found in KHA samples. This population may play a central role facilitating emergence of resistant parasites under the pressure of anti-malaria drugs. The present study of the early emerging artemisinin resistant populations suggest that newly emerging parasites are susceptible to be resistant to several anti-malarial drugs. The fitness of such parasites is under question, but tracking the diffusion of these new genetic background outside the GMS should certainly be extend to additional markers than the k13 gene alone.
Conceived and designed experiments: CR, DM, and EC. Retrieved and analysed sequence data: AD, MH, SM and ER. Generated structural and functional annotation: AD, NK, DBR and EC. Generated scripts for mathematical and statistical analysis and figures: AD, CR, AK, DBR, JC and EC. Interpreted the data and provided critical discussion: AD, CR, ER, DM, RF, CBM and EC. Wrote the manuscript: AD, RF, CBM, JC and EC. All authors edited the manuscript. All authors read and approved the final manuscript.
We would sincerely like to acknowledge Oliver Miotto for validating the correspondence between the isolates and the described subpopulations.
The authors declare that they have no competing interests.
Availability of data and materials
The full list of ENA data base entries for all the samples used, is provided in Additional file 1. The genome sequences were submitted by Wellcome Trust Sanger Institute and originate from three locations in western Cambodia (Pursat, Pailin and Tasanh) and one location in eastern Cambodia (Ratanakiri). The SNP catalog of 21,257 SNPs among 167 isolates is provided as Additional file 7.
Ethics approval and consent to participate
DBR and EC were supported by the ANR (Agence Nationale de la Recherche “Investissements d’avenir/Bioinformatique”): ANR-11-BINF-0002 (Institut de Biologie Computationnelle). AD was funded by Erasmus Mundus Action 2: Svaagata.eu project: India, funded by the European Commission (ref.nr Agreement Number: 2012-2648/001-001-EM Action 2-Partnerships).
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Dondorp AM, Nosten F, Yi P, Das D, Phyo AP, Tarning J, et al. Artemisinin resistance in Plasmodium falciparum malaria. N Engl J Med. 2009;361:455–67.View ArticlePubMedPubMed CentralGoogle Scholar
- Tun KM, Imwong M, Lwin KM, Win AA, Hlaing TM, Hlaing T, et al. Spread of artemisinin-resistant Plasmodium falciparum in Myanmar: a cross-sectional survey of the K13 molecular marker. Lancet Infect Dis. 2015;15:415–21.View ArticlePubMedPubMed CentralGoogle Scholar
- Wang Z, Wang Y, Cabrera M, Zhang Y, Gupta B, Wu Y, et al. Artemisinin resistance at the China–Myanmar border and association with mutations in the K13 propeller gene. Antimicrob Agents Chemother. 2015;59:6952–9.View ArticlePubMedPubMed CentralGoogle Scholar
- Miotto O, Amato R, Ashley EA, MacInnis B, Almagro-Garcia J, Amaratunga C, et al. Genetic architecture of artemisinin-resistant Plasmodium falciparum. Nat Genet. 2015;47:226–34.View ArticlePubMedPubMed CentralGoogle Scholar
- Ashley EA, Dhorda M, Fairhurst RM, Amaratunga C, Lim P, Suon S, et al. Spread of artemisinin resistance in Plasmodium falciparum malaria. N Engl J Med. 2014;371:411–23.View ArticlePubMedPubMed CentralGoogle Scholar
- Huang F, Takala-Harrison S, Jacob CG, Liu H, Sun X, Yang H, et al. A single mutation in K13 predominates in southern China and is associated with delayed clearance of Plasmodium falciparum following artemisinin treatment. J Infect Dis. 2015;212:1629–35.View ArticlePubMedPubMed CentralGoogle Scholar
- Amaratunga C, Witkowski B, Khim N, Menard D, Fairhurst RM. Artemisinin resistance in Plasmodium falciparum. Lancet Infect Dis. 2014;14:449–50.View ArticlePubMedPubMed CentralGoogle Scholar
- Takala-Harrison S, Jacob CG, Arze C, Cummings MP, Silva JC, Dondorp AM, et al. Independent emergence of artemisinin resistance mutations among Plasmodium falciparum in Southeast Asia. J Infect Dis. 2015;211:670–9.View ArticlePubMedGoogle Scholar
- Nyunt MH, Hlaing T, Oo HW, Tin-Oo LL, Phway HP, Wang B, et al. Molecular assessment of artemisinin resistance markers, polymorphisms in the k13 propeller, and a multidrug-resistance gene in the eastern and western border areas of Myanmar. Clin Infect Dis. 2015;60:1208–15.View ArticlePubMedGoogle Scholar
- Ménard D, Khim N, Beghain J, Adegnika AA, Shafiul-Alam M, Amodu O, et al. A worldwide map of Plasmodium falciparum K13-propeller polymorphisms. N Engl J Med. 2016;374:2453–64.View ArticlePubMedPubMed CentralGoogle Scholar
- Manske M, Miotto O, Campino S, Auburn S, Almagro-Garcia J, Maslen G, et al. Analysis of Plasmodium falciparum diversity in natural infections by deep sequencing. Nature. 2012;487:375–9.View ArticlePubMedPubMed CentralGoogle Scholar
- Miotto O, Almagro-Garcia J, Manske M, MacInnis B, Campino S, Rockett KA, et al. Multiple populations of artemisinin-resistant Plasmodium falciparum in Cambodia. Nat Genet. 2013;45:648–55.View ArticlePubMedGoogle Scholar
- Dwivedi A, Khim N, Reynes C, Ravel P, Ma L, Tichit M, et al. Plasmodium falciparum parasite population structure and gene flow associated to anti-malarial drugs resistance in Cambodia. Malar J. 2016;15:319.View ArticlePubMedPubMed CentralGoogle Scholar
- Ariey F, Witkowski B, Amaratunga C, Beghain J, Langlois AC, Khim N, et al. A molecular marker of artemisinin-resistant Plasmodium falciparum malaria. Nature. 2014;505:50–5.View ArticlePubMedGoogle Scholar
- Imwong M, Suwannasin K, Kunasol C, Sutawong K, Mayxay M, Rekol H, et al. The spread of artemisinin-resistant Plasmodium falciparum in the Greater Mekong subregion: a molecular epidemiology observational study. Lancet Infect Dis. 2017;17:491–7.View ArticlePubMedPubMed CentralGoogle Scholar
- Phyo AP, Ashley EA, Anderson TJC, Bozdech Z, Carrara VI, Sriprawat K, et al. Declining efficacy of artemisinin combination therapy against P. falciparum malaria on the Thai–Myanmar border (2003–2013): the role of parasite genetic factors. Clin Infect Dis. 2016;63:784–91.View ArticlePubMedPubMed CentralGoogle Scholar
- Mita T, Tanabe K. Evolution of Plasmodium falciparum drug resistance: implications for the development and containment of artemisinin resistance. Jpn J Infect Dis. 2012;65:465–75.View ArticlePubMedGoogle Scholar
- Mita T, Tanabe K, Kita K. Spread and evolution of Plasmodium falciparum drug resistance. Parasitol Int. 2009;58:201–9.View ArticlePubMedGoogle Scholar
- Anderson TJ, Haubold B, Williams JT, Estrada-Franco JG, Richardson L, Mollinedo R, et al. Microsatellite markers reveal a spectrum of population structures in the malaria parasite Plasmodium falciparum. Mol Biol Evol. 2000;17:1467–82.View ArticlePubMedGoogle Scholar
- Payne D. Did medicated salt hasten the spread of chloroquine resistance in Plasmodium falciparum? Parasitol Today. 1988;4:112–5.View ArticlePubMedGoogle Scholar
- Hastings IM. The origins of antimalarial drug resistance. Trends Parasitol. 2004;20:512–8.View ArticlePubMedGoogle Scholar
- Mbengue A, Bhattacharjee S, Pandharkar T, Liu H, Estiu G, Stahelin RV, et al. A molecular mechanism of artemisinin resistance in Plasmodium falciparum malaria. Nature. 2015;520:683–7.View ArticlePubMedPubMed CentralGoogle Scholar
- Dogovski C, Xie SC, Burgio G, Bridgford J, Mok S, McCaw JM, et al. Targeting the cell stress response of Plasmodium falciparum to overcome artemisinin resistance. PLoS Biol. 2015;13:e1002132.View ArticlePubMedPubMed CentralGoogle Scholar
- Mok S, Ashley EA, Ferreira PE, Zhu L, Lin Z, Yeo T, et al. Drug resistance. Population transcriptomics of human malaria parasites reveals the mechanism of artemisinin resistance. Science. 2015;347:431–5.View ArticlePubMedGoogle Scholar
- Ariey F, Duchemin JB, Robert V. Metapopulation concepts applied to falciparum malaria and their impacts on the emergence and spread of chloroquine resistance. Infect Genet Evol. 2003;2:185–92.View ArticlePubMedGoogle Scholar
- Chang HH, Moss EL, Park DJ, Ndiaye D, Mboup S, Volkman SK, et al. Malaria life cycle intensifies both natural selection and random genetic drift. Proc Natl Acad Sci USA. 2013;110:20129–34.View ArticlePubMedPubMed CentralGoogle Scholar
- Hofree M, Shen JP, Carter H, Gross A, Ideker T. Network-based stratification of tumor mutations. Nat Methods. 2013;10:1108–15.View ArticlePubMedPubMed CentralGoogle Scholar
- Danecek P, Auton A, Abecasis G, Albers CA, Banks E, DePristo MA, et al. The variant call format and VCFtools. Bioinformatics. 2011;27:2156–8.View ArticlePubMedPubMed CentralGoogle Scholar
- Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25:2078–9.View ArticlePubMedPubMed CentralGoogle Scholar
- Aurrecoechea C, Brestelli J, Brunk BP, Dommer J, Fischer S, Gajria B, et al. PlasmoDB: a functional genomic database for malaria parasites. Nucleic Acids Res. 2009;37:D539–43.View ArticlePubMedGoogle Scholar
- MalariaGEN Plasmodium falciparum community project. Genomic epidemiology of artemisinin resistant malaria. Elife. 2016;5:e08714.View ArticleGoogle Scholar
- Szklarczyk D, Franceschini A, Wyder S, Forslund K, Heller D, Huerta-Cepas J, et al. STRING v10: protein–protein interaction networks, integrated over the tree of life. Nucleic Acids Res. 2015;43:D447–52.View ArticlePubMedGoogle Scholar
- Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003;13:2498–504.View ArticlePubMedPubMed CentralGoogle Scholar
- Le Roch KG, Zhou Y, Blair PL, Grainger M, Moch JK, Haynes JD, et al. Discovery of gene function by expression profiling of the malaria parasite life cycle. Science. 2003;301:1503–8.View ArticlePubMedGoogle Scholar
- Bindea G, Mlecnik B, Hackl H, Charoentong P, Tosolini M, Kirilovsky A, et al. ClueGO: a Cytoscape plug-into decipher functionally grouped gene ontology and pathway annotation networks. Bioinformatics. 2009;25:1091–3.View ArticlePubMedPubMed CentralGoogle Scholar
- Bindea G, Galon J, Mlecnik B. CluePedia Cytoscape plugin: pathway insights using integrated experimental and in silico data. Bioinformatics. 2013;29:661–3.View ArticlePubMedPubMed CentralGoogle Scholar
- Lee DD, Seung HS. Learning the parts of objects by non-negative matrix factorization. Nature. 1999;401:788–91.View ArticlePubMedGoogle Scholar
- McGuffin LJ, Atkins JD, Salehe BR, Shuid AN, Roche DB. IntFOLD: an integrated server for modelling protein structures and functions from amino acid sequences. Nucleic Acids Res. 2015;43:W169–73.View ArticlePubMedPubMed CentralGoogle Scholar
- Kelley LA, Mezulis S, Yates CM, Wass MN, Sternberg MJ. The Phyre2 web portal for protein modeling, prediction and analysis. Nat Protoc. 2015;10:845–58.View ArticlePubMedPubMed CentralGoogle Scholar
- Källberg M, Wang H, Wang S, Peng J, Wang Z, Lu H, et al. Template-based protein structure modeling using the RaptorX web server. Nat Protoc. 2012;7:1511–22.View ArticlePubMedPubMed CentralGoogle Scholar
- Wu S, Zhang Y. LOMETS: a local meta-threading-server for protein structure prediction. Nucleic Acids Res. 2007;35:3375–82.View ArticlePubMedPubMed CentralGoogle Scholar
- McGuffin LJ, Roche DB. Rapid model quality assessment for protein structure predictions using the comparison of multiple models without structural alignments. Bioinformatics. 2010;26:182–8.View ArticlePubMedGoogle Scholar
- Zhang Y, Skolnick J. TM-align: a protein structure alignment algorithm based on the TM-score. Nucleic Acids Res. 2005;33:2302–9.View ArticlePubMedPubMed CentralGoogle Scholar
- Hu M, Nandi S, Davies C, Nicholas RA. High-level chromosomally mediated tetracycline resistance in Neisseria gonorrhoeae results from a point mutation in the rpsJ gene encoding ribosomal protein S10 in combination with the mtrR and penB resistance determinants. Antimicrob Agents Chemother. 2005;49:4327–34.View ArticlePubMedPubMed CentralGoogle Scholar
- Spitzmüller A, Mestres J. Prediction of the P. falciparum target space relevant to malaria drug discovery. PLoS Comput Biol. 2013;9:e1003257.View ArticlePubMedPubMed CentralGoogle Scholar
- Wentzinger L, Bopp S, Tenor H, Klar J, Brun R, Beck HP, et al. Cyclic nucleotide-specific phosphodiesterases of Plasmodium falciparum: PfPDE alpha, a non-essential cGMP-specific PDE that is an integral membrane protein. Int J Parasitol. 2008;38:1625–37.View ArticlePubMedGoogle Scholar
- Hopp CS, Bowyer PW, Baker DA. The role of cGMP signaling in regulating life cycle progression of Plasmodium. Microbes Infect. 2012;14:831–7.View ArticlePubMedPubMed CentralGoogle Scholar
- Wang J, Huang L, Li J, Fan Q, Long Y, Li Y, et al. Artemisinin directly targets malarial mitochondria through its specific mitochondrial activation. PLoS ONE. 2010;5:e9582.View ArticlePubMedPubMed CentralGoogle Scholar
- Feng W, Zhang W, Wang H, Ma L, Miao D, Liu Z, et al. Analysis of phosphorylation sites on autophagy proteins. Protein Cell. 2015;6:698–701.View ArticlePubMedPubMed CentralGoogle Scholar
- Mizushima N, Yoshimori T, Ohsumi Y. The role of Atg proteins in autophagosome formation. Annu Rev Cell Dev Biol. 2011;27:107–32.View ArticlePubMedGoogle Scholar
- Kroemer G, Mariño G, Levine B. Autophagy and the integrated stress response. Mol Cell. 2010;40:280–93.View ArticlePubMedPubMed CentralGoogle Scholar
- Matsui A, Kamada Y, Matsuura A. The role of autophagy in genome stability through suppression of abnormal mitosis under starvation. PLoS Genet. 2013;9:e1003245.View ArticlePubMedPubMed CentralGoogle Scholar
- Kaplon J, Dam LV, Peeper D. Two-way communication between the metabolic and cell cycle machineries: the molecular basis. Cell Cycle. 2015;14:2022–32.View ArticlePubMedPubMed CentralGoogle Scholar
- Altman BJ, Rathmell JC. Metabolic stress in autophagy and cell death pathways. Cold Spring Harb Perspect Biol. 2012;4:a008763.View ArticlePubMedPubMed CentralGoogle Scholar
- Zhang SW, Feng JN, Cao Y, Meng LP, Wang SL. Autophagy prevents autophagic cell death in Tetrahymena in response to oxidative stress. Zool Res. 2015;36:167–73.PubMedPubMed CentralGoogle Scholar
- Solyakov L, Halbert J, Alam MM, Semblat JP, Dorin-Semblat D, Reininger L, et al. Global kinomic and phospho-proteomic analyses of the human malaria parasite Plasmodium falciparum. Nat Commun. 2011;29:565.View ArticleGoogle Scholar
- Paloque L, Ramadani AP, Mercereau-Puijalon O, Augereau JM, Benoit-Vical F. Plasmodium falciparum: multifaceted resistance to artemisinins. Malar J. 2016;9:149.View ArticleGoogle Scholar
- Wright S. The theoretical variance within and among subdivisions of a population that is in a steady state. Genetics. 1952;37:312–21.PubMedPubMed CentralGoogle Scholar
- Bray P, Martin RE, Tilley L, Ward S, Kirk K, Fidock DA. Defining the role of PfCRT in Plasmodium falciparum chloroquine resistance. Mol Microbiol. 2005;56:323–33.View ArticlePubMedGoogle Scholar
- Juge N, Moriyama S, Miyaji T, Kawakami M, Iwai H, Fukui T, et al. Plasmodium falciparum chloroquine resistance transporter is a H+-coupled polyspecific nutrient and drug exporter. Proc Natl Acad Sci USA. 2015;112:3356–61.View ArticlePubMedPubMed CentralGoogle Scholar
- Briolant S, Bogreau H, Gil M, Bouchiba H, Baret E, Amalvict R, et al. The F423Y mutation in the pfmdr2 gene and mutations N51I, C59R, and S108 N in the pfdhfr gene are independently associated with pyrimethamine resistance in Plasmodium falciparum isolates. Antimicrob Agents Chemother. 2012;56:2750–2.View ArticlePubMedPubMed CentralGoogle Scholar
- Triglia T, Menting JG, Wilson C, Cowman AF. Mutations in dihydropteroate synthase are responsible for sulfone and sulfonamide resistance in Plasmodium falciparum. Proc Natl Acad Sci USA. 1997;94:13944–9.View ArticlePubMedPubMed CentralGoogle Scholar
- Gaillard T, Madamet M, Pradines B. Tetracyclines in malaria. Malar J. 2015;14:445.View ArticlePubMedPubMed CentralGoogle Scholar
- Phimda K, Hoontrakul S, Suttinont C, Chareonwat S, Losuwanaluk K, Chueasuwanchai S, et al. Doxycycline versus azithromycin for treatment of leptospirosis and scrub typhus. Antimicrob Agents Chemother. 2007;51:3259–63.View ArticlePubMedPubMed CentralGoogle Scholar