The association of CYP2D6 gene polymorphisms in the full-length coding region with higher recurrence rate of vivax malaria in Yunnan Province, China

Accumulating evidence suggest that compromised CYP2D6 enzyme activity caused by gene mutation could contribute to primaquine failure for the radical cure of vivax malaria. The current study aims to preliminarily reveal the association between the recurrence of vivax malaria in Yunnan Province and CYP2D6 gene mutation by analysing polymorphisms in the entire coding region of human CYP2D6 gene. Blood samples were collected from patients with vivax malaria, who received "chloroquine and 8-day course of primaquine therapy" in Yunnan Province. The suspected relapsed cases were determined by epidemiological approaches and gene sequence alignment. PCR was conducted to amplify the CYP2D6 gene in the human genome, and the amplified products were then sequenced to compare with the non-mutation “reference” sequence, so as to ensure correct sequencing results and to determine 9 exon regions. Subsequently, the DNA sequences of 9 exons were spliced into the coding DNA sequence (CDS), which, by default, is known as maternal CDS. The paternal CDS was obtained by adjusting the bases according to the sequencing peaks. The mutation loci, haplotypes (star alleles), genotypes and odds ratios (OR) of all the CDSs were analysed. Of the119 maternal CDS chains in total with 1491 bp in length, 12 mutation sites in the 238 maternal and paternal CDS chains were detected. The c.408G > C mutation was most frequently detected in both suspected relapsed group (SR) and non-relapsed group (NR), reaching 85.2% (75/88) and 76.0% (114/150), respectively. The c.886C > T mutation was most closely related to the recurrence of vivax malaria (OR = 2.167, 95% CI 1.104–4.252, P < 0.05). Among the 23 haplotypes (Hap_1 ~ Hap_23), Hap_3 was non-mutant, and the sequence structure of Hap_9 was the most complicated one. Five star alleles, including *1, *2, *4, *10 and *39, were confirmed by comparison, and CYP2D6*10 allele accounted for the largest percentage (45.4%, 108/238). The frequency of CYP2D6*2 allele in the SR group was significantly higher than that in the NR group (Χ2 = 16.177, P < 0.05). Of the defined 24 genotypes, 8 genotypes, including *4/*4, *4/*o, *2/*39, *39/*m, *39/*x, *1/*r, *1/*n, and *v/*10, were detected only in the SR group. Mutation of CYP2D6*10 allele accounts for the highest proportion of vivax malaria cases in Yunnan Province. The mutations of c. 886C > T and CYP2D6*2 allele, which correspond to impaired PQ metabolizer phenotype, are most closely related to the relapse of vivax malaria. In addition, the genotype *4/*4 with null CYP2D6 enzyme function was only detected in the SR group. These results reveal the risk of defected CYP2D6 enzyme activity that diminishes the therapeutic effect of primaquine on vivax malaria.


Background
In recent years, the global malaria epidemic has gradually scaled down, yet the death tolls remain high and the number of malaria cases in certain areas has increased [1]. Meanwhile, multiple technological restrains, including the difficulty to identify low-density Plasmodium infection [2], the expansion of drug resistant Plasmodium [3,4], and the lack of new drugs in the anti-malarial pipeline [5], pose daunting challenges to the elimination of malaria. Although the local cases of malaria have not been reported since 2016 [6], China still needs to cope with a large number of imported malaria cases, especially from Africa and Southeast Asian countries [7]. From 2011 to 2018, 50.7% of the imported malaria cases in Yunnan Province were caused by Plasmodium vivax [8]. Therefore, the elimination of malaria calls for the strict enforcement to block the transmission of P. vivax.
Tafenoquine and primaquine are recommended by the World Health Organization (WHO) for the treatment of relapsed malaria owing to its efficiency of eradicating the hypnozoites of P. vivax via 5-hydroxy-primaquine (5-hydroxy-primaquine) formed by the enzyme CYP2D6 (Cytochrome P450, family 2, subfamily D, polypeptide 6) in human liver cells [9][10][11][12]. CYP2D6, also known as debriso-quine4-hydroxylase, is an isozyme in the Cytochrome P450 (CYP450) super family and an important phase I drugmetabolizing enzyme. Owing to its low capacity and high affinity, CYP2D6 is involved in the metabolism of ~ 30% of commonly used drugs, including primaquine although it only accounts for 4% of the total P450 enzyme protein in the liver [12]. Studies have found that the heterogeneity of CYP2D6 activity is predominantly governed by its genetic variation [13], and that decreased CYP2D6 isoenzyme activity caused by genetic polymorphisms would obstruct the generation of 5-hydroxy-primaquine, making large doses or repeated use of primaquine necessary to compensate for the declined efficacy of primaquine. Meanwhile, primaquine can cause life threatening haemolysis in patients with glucose-6-phosphate dehydrogenase (G6PD) deficiency, subsequently inflicting low efficacy in anti-relapse and acute hemolysis [9,14,15]. Therefore, the risk of G6PD deficiency should be identified prior to receiving anti-relapse treatment of vivax malaria, and that the genotype and enzyme activity of CYP2D6 should be tested as well [14,16,17]. CYP2D6 enzyme activity is measured mainly through phenotypic observation and genotypic analysis. Compared with phenotypic characterization, genotypic prediction manifests the advantages of greater stability, more convenience and being less affected by environmental or physiological factors [18][19][20]. CYP2D6 enzyme activity can be detected from the blood sample of the patients; genotypic sequencing is currently the common technical method to determine the allelic form of CYP2D6 gene [21][22][23][24][25] by identifying the locus status of the CYP2D6 gene [13] or allelic combinations [26,27], thereby predicting the activity level of CYP2D6. However, as the number of alleles of CYP2D6 gene continues to increase, efforts should be made to improve the precision and inclusiveness of genotyping results [28]. The better understanding of the genetic polymorphism of CYP2D6, which metabolizes about 25% of all medications in the human liver, can not only help deal with individual adverse event and but also enable the more precise DNA genotyping-based prediction.
Primaquine is commonly used in the malaria endemic areas of Yunnan Province, yet the systematic survey on the genetic polymorphisms of CYP2D6 has been not conducted. In this study, Whole Genome Sequencing (WGS) was conducted to investigate the polymorphisms of CYP2D6 gene in the vivax malaria cases, thereby revealing the association between human CYP2D6 gene polymorphisms and the relapsed cases of vivax malaria.

Study subjects and sampling
A total of 120 subjects living in Yunnan province, China (97°31′ E to 106°11′ E; 21°8′ N to 29°15′ N) were enrolled in the study. The subjects were diagnosed of mono-infection of P. vivax by both microscopy examination and Plasmodium 18S rRNA gene detection (PCR) in Yunnan Province Malaria Diagnosis Referent Laboratory (YPM-DRL) (Additional file 1). All the patients received oral chloroquine therapy (1550 mg in total) within 3 days, followed by the subsequent 8-day course of primaquine therapy (22.5 mg/day).
The subjects diagnosed from 2014 to 2018 were subdivided into suspected relapse of vivax malaria (SR group) and non-relapse of vivax malaria (NR group). The inclusion criteria of the SR group are: (1) the patients were re-diagnosed as P. vivax infection within the interval of 28-180 days after the initial diagnosis; (2) epidemiological investigation confirmed that the patients were not re-exposed to malaria endemic areas between the two malaria episodes; and (3) the comparative sequencing of CSP (Circumsporozoite Surface Protein) and MSP-1 (Merozoite Surface Protein 1) of P. vivax with those of the previously infected strains found 100% similarity and 0 variable sites.
A simple random sampling method was conducted to selected 75 NR cases from the 143 patients confirmed in 2018. The subjects in the NR group had no history of relapsed vivax malaria after 1 year of follow-up. The source of vivax malaria infection was determined through epidemiological investigations. The subjects had no travelling history to malaria endemic areas within 30 days before the onset of malaria were regarded as local cases, otherwise they were regarded as imported cases. The study was approved by Yunnan Institute of Parasitic Diseases and Ethical Committee document No. 201904 (January 25, 2019). Genotyping was performed by using the stored blood samples obtained from the suspected patients who experienced fever and chills. Informed consent was obtained from all the patients. The blood samples of the subjects were provided by the Centers for Disease Control and Prevention in Yunnan Province, according to the described protocol [29]. Dried blood spots (DBS) sampling was conducted by collected 0.6 ml of venous blood, which was stored in a dried tube at − 80 ℃ before the extraction of DNA.

Extraction of genomic DNA
Three dried blood spots (diameter = 5 mm) were made, and genomic DNA was extracted according to the manufacturer's instructions of the QIAgen Mini Kit (Germany, QIAamp Company's DNA Mini Kit), according to the manufacture's instruction. The extracted DNA was stored at − 20 ℃.
The primers that cover exons 5-9 of CYP2D6 gene were: 5′-GCC GAC TGA GCC CTG GGA GGT AGG TA-3′ and 5′-GCT GGG GCC TGA GACTT-3. The amplification area was 42,126,035-42,128,422, and the amplification products had an expected fragment length of 2388 bp. For all the PCR reaction systems, we used 2.6 μl DNA template, 14.0 μl 2 × Taq PCR hybrid system (QIAGEN, Germany), 0.7 μl upstream primer (20umol/L), 0.7 μl downstream primers (20umol/L). The total volume was adjusted to 25.0 μl with ddH 2 O. PCR reaction conditions were clarified as follows: 92 to 94 °C for 2-5 min; 92 to 94 °C for 10-30 s, 50 to 56 °C for 15-30 s, 68 to 72 °C for 2 to 3 min, 35 cycles; 68 to 72 °C for 7 min. The triplicated parallel repetition was adopted for each PCR reaction. The amplified products were observed using 1.5% agarose gel electrophoresis. The positive amplification products were sent to Shanghai Meiji Biomedical Technology Co, Ltd. for sequencing by using the Sanger method. Only the sequences that showed identical results in at least two tests were used for subsequent analysis.

CYP2D6 gene mutations analysis
The sequencing reads were aligned by using DNAStar and BioEdit software packages. The obtained DNA sequences were assessed by using the Basic Local Alignment Search Tool (BLAST, http:// blast. ncbi. nlm. nih. gov/ Blast. cgi) on the NCBI platform. The sequences having a homology of 100% and query cover of above 99% with the non-mutated "reference" sequence (ID: NC_000022.11) were considered as the human CYP2D6 gene sequence.
The sequence was further compared with the non-mutation reference sequence to identify 9 exon regions, the transcriptional starting point and the end point of CYP2D6 gene. The DNA sequences of the 9 exon regions were spliced into the maternal CDS (coding DNA sequence) of CYP2D6 gene in the order from exon 1 to exon 9. The paternal CDS was obtained by replacing the undisplayed diploid bases on the maternal CDS chain [31,32], and the diploid bases were determined by searching the sequencing peaks (Additional file 2). This approach makes up for the shortcoming that the sequencer can only read one base signal for each site, whereas misses the other hybrid base. MEGA v5.04 software was used to analyse the missense mutations and synonymous mutation sites of maternal and paternal CDS, and the ID number of the defined mutation site was queried in Genbank. DnaSP 6.11.01 software was used to identify the haplotype of the CDS, to calculate the nucleic acid diversity index (π), expected heterozygosity (He) and other parameters [33].
The haplotype (allelic form) of CDSs was determined according to the criteria of Human Cytochrome P450 Allele Nomenclature Committee (NM_000106.6) [34] and previous study [35], and were termed to the level of sub-alleles (such as *10.001 and *39.001). The sub-alleles were then merged. Those haplotypes unmatched with known CYP2D6 alleles were grouped into the "other" category. The genotype of each sample was a diploid composed of the maternal CDS and parental CDS (such as *10/*39 and *1/*10). SPSS software (version 21.0; IBN; Chicago; IL) was used to conduct chi-square test on the frequency of genotypes and mutation loci co-existed in the SR cases and NR cases. The odds ratio (OR) of genotypes and mutation loci with statistical significance were calculated by comparing to the relapse rate of P. vivax. The significance level was set as P < 0.05; OR < 1 indicates that the haplotype is a protective factor, while OR < 1 indicates a risk factor.

Demographics and clinical characteristics of the subjects and PCR amplification of human CYP2D6 gene
A total of 45 SR vivax malaria cases were confirmed by epidemiological investigation and gene sequence alignment (Additional file 3), including 43 cases with one suspected relapsed event, 1 case with two suspected relapsed events and 1 case with three suspected relapsed events. The demographic and clinical characteristics of the subjects are listed in Table 1. The male-female ratio was 3:1. Most of the patients presented only one suspected relapsed episode (95.6%).
Exons 1-4 and 5-9 of CYP2D6 gene in the blood samples collected from 45 SR patients and 75 NR patients were amplified by conducting PCR. The amplified product revealed a clear band at > 2000 bp, and therefore can be considered as the target band (Additional file 4). The CYP2D6 gene fragments of 75 NR patients and 44 SR patients were successfully amplified, expect for the PCR product of CYP2D6 gene was not effectively amplified for 1 SR patient.

Locus polymorphisms of CDS and its association with relapsed cases of vivax malaria
A total of 119 maternal CDS of CYP2D6 gene (Genbank accession number: MT339075-MT339193) were obtained. Base substitutions at 12 loci, including c.31 and c.100, on the 238 maternal and paternal CDS, were determined ( Table 2). The proportions of third-base and first-base substitution in the triplet codon were 41.7% (5/12), and the proportion of second-base substitution was 16.6% (2/12). 7 missense mutation loci and 5 synonymous mutation loci were determined. 91.7% (11/12) of the mutation sites were found in the sequences of SR group, while 66.7% (8/12) of the mutation sites were determined in the sequences of NS group ( Table 2).
The comparative results of 12 single nucleotide polymorphisms (SNPs) between the two groups were listed in Table 2. In the SR group, the frequency of mutation allele was the highest at c.1457 G > C (85.2%, 75/88), and c. 408 G > C (85.2%,75/88). In the NR group, the highest frequency of mutation was at c. 408 G > C (76.0%, 114/150), followed by c. 1457 G > C (72.0%, 108/150). Eight mutation loci found both in SR group and NR group were selected to analyse the association between loci mutation and vivax malaria relapse. Correlation analysis showed that the ratio of c.886C > T site mutation to the occurrence of vivax malaria recurrence was 2.167 (95% CI 1.104-4.252), and that statistically significance was determined (P˂ 0.05). Such result suggests that c.886 mutation increases the risk of recurrence of vivax malaria, which is 2.167 times higher than that of other SNPs (Table 2).

Haplotype diversity
A total of 23 haplotypes (Hap_1 ~ Hap_23) were identified amongst the 238 CDSs of CYP2D6 gene. Of them, 9 haplotypes were only observed in the sequences of SR group (π = 0.0015, He = 0.821), while 6 haplotypes were only found in the NR group (π = 0.0014 and He = 0.760). 8 haplotypes, including Hap_2, Hap_3, Hap_4, Hap_5, Hap_6, Hap_7, Hap_13 and Hap_14, were found in both SR and NR group. The compositions of every haplotype were shown in Fig. 1. Hap_3 did not contain mutation loci, and the rest haplotypes contained different mutation loci.

Discussion
Located on chromosome 22q13.1, CYP2D6 gene (4383 bp in length) consists of 8-9 exons and 7-8 introns. Its CDS is 1338-1491 bp in length and encodes 446-497 amino acids. The heterogeneity of CYP2D6 monooxygenase activity varies from complete dysfunction to ultra-rapid enzyme metabolism, depending on the various genotypes of CYP2D6 [14]. Among the 150 identified alleles, multiple mutations such as Table 3 The association between CYP2D6 genotypes and the relapse of vivax malaria n1 number of chromosomes, n2 number of cases, F Frequency Other: *m, *n, *o, *p, *q,*r,*t, *u, *v, *w and *x; a They could not meet the allele inclusion criteria provided by the Allele Nomenclature Committee [32]; b Chi-square test; NS: not significant (P > 0.05); c The OR value of genotypes were calculated by comparing to the relapse rate of P. vivax; S: significant (P < 0.05); NS: not significant (P > 0.05)

Alleles
Total CYP2D6*3, CYP2D6*4, CYP2D6*5 and CYP2D6*6 are considered to represent null CYP2D6 enzyme activity [30,32,33,36]. The current study is the first one that attempts to analyse CYP2D6 polymorphisms in the entire coding region sequence in the blood specimen of relapsed vivax malaria patients in Yunnan. The current study reveals the association between CYP2D6 polymorphisms and thwarted CYP2D6 enzyme activity, which is accountable to the failed treatment of using primaquine. This study found that mutation at c. 886 locus of CYP2D6 gene elevated the risk of vivax malaria relapse risk by 2.167-fold (P < 0.05). In the findings of Wang et al. [37] and Gaedigk et al. [38], it is concluded that the relapse of vivax malaria may be relate to c.886. locus mutation of CYP2D6 gene in Yunnan Province.
Among the 23 alleles defined by the 23 haplotypes in all the CDSs, CYP2D6*10 accounts for the highest proportion, reaching as high as 45.4% (108/238). Consistently, previous studies found the highest frequency of CYP2D6*10 alleles mutation in the Chinese Han population and in Asian subjects [31,[39][40][41]. Although the frequency of CYP2D6*10 alleles did not show statistical significant difference between the SR group and the SR group, CYP2D6*10 mutation was considered to indicate decreased enzyme activity [24,34]. This notion suggested that the wide distribution of CYP2D6 gene*10 alleles in the population of Yunnan should not be disregarded, as it will amplify the accumulated risk of reduced therapeutic efficacy of primaquine in eliminating P. vivax hypnozoites in Yunnan Province.
The current study found that CYP2D*2 allele was more frequently determined in the relapsed patients. This result was not in agreement with that of Brasil et al. [30], which reported that the frequency of CYP2D*2 mutation was significantly higher in the non-relapsed patients. Such inconsistent results might be attributed to the different subjects enrolled in the two studies. Specifically, the subjects included in this study were vivax malaria patients who live in the malaria endemic areas in Yunnan Province, hence the long-term malaria epidemic environment could positively screen the human defective genes [42][43][44].
In this study, all the alleles were identified based on the mutations in the exon regions of CYP2D6, which is different from allelic recognition that covers intron mutations [32,45]. Therefore, the homogeneity of CYP2D6*2 allele in the two studies could not be ruled out. Although CYP2D6*2 allele is generally considered to be an allelic type of CYP2D6 normal enzyme activity, multiple mutation that contains c. 886 locus could result in lowered activity of CYP2D6 enzyme [38,42]. At present, the complicated multi-directional change of CYP2D6 enzyme activity caused by *2 allelic mutation is yet to be validated [35,37,38,45].
Eight mutant allelic types were found only in the SR group, yet the association between these different CYP2D6 genotypes and the relapse of vivax malaria could not be established. Given that the *10/*10 genotype still accounted for the largest proportion (30.3%) in the overall study cohort and the no-function *4/*4 genotype was found in the SR group [39], the risk of declined therapeutic efficacy of primaquine caused by defective CYP2D6 enzyme activity in malaria endemic areas should not be ignored. In addition, effort should be made to validate whether the one case of amplification failure case in this study was caused by full gene deletion (*5/*5) or other factors.
The limitations of the current study should be noted. Firstly, the sample size of suspected recurrence of vivax malaria is not large enough, and the alleles (*4, *m-*x,) of CYP2D6 genes with low frequency of distribution might not be quantitatively analyzed (Additional file 5). Secondly, while the re-infection with P. vivax was ruled out from the SR group, the influences of primaquine doses and primaquine resistance were not evaluated. Finally, polymorphisms in the entire coding region of CYP2D6 gene were investigated yetthe possible mutation in intron region and the copy number of CYP2D6 gene, which has been considered to disturb the activity of CYP2D6 enzyme, were not analysed. Thusly, the activity of CYP2D6 cannot be indirectly predicted based on the results of CYP2D6 genotyping in the present study. In view of such shortcoming, the sampling size should be expanded to conduct molecular epidemiological investigations on CYP2D6 genotype.
Currently, the number of CYP2D6 gene CDS has increased to 320, and consistent results pertaining to the loci mutation and allelic type were found. Moreover, multiplex ligation-dependent probe amplification (MLPA) and other advanced methods will be applied to optimize the accurate identification of human CYP2D6 genotypes, so as to swiftly predict CYP2D6 enzyme activity of individual subjects. These findings will contribute to the efficacy of primaquine in the treatment of vivax malaria in Yunnan province.

Conclusions
This pilot study reveals the polymorphisms of CYP2D6 gene coding region and its association with the recurrence of vivax malaria in the population of Yunnan Province, suggesting that the radical cure of vivax malaria by using primaquine could be hampered due to the impaired CYP2D6 enzyme activity in the population of Yunnan Province. The study revealed that (1) CYP2D6*10 allele accounts for the highest proportion in the vivax malaria cases of Yunnan Province; (2) mutation at c. 886 locus is most closely related to the relapse of vivax malaria and CYP2D6*2 allele containing this mutation locus was significantly higher in SR; and (3) The *4/*4 genotype that indicates null CYP2D6 enzyme function (poor metabolizers) was only found in the SR cases.