Mitochondrial gene sequence variants in children with severe malaria anaemia with or without lactic acidosis: a case control study

Background Evolutionary pressure by Plasmodium falciparum malaria is known to have favoured a large number of human gene adaptations, but there is surprisingly little investigation of the effect of malaria on human mitochondrial sequence variation. Plasmodium falciparum infection can cause severe malaria anaemia (SMA) with insufficient tissue oxygenation, lactic acidosis and death. Despite equal degrees of severe anaemia, some individuals develop lactic acidosis while others do not. A case–control study design was used to investigate whether differences in host mitochondrial gene sequences were associated with lactic acidosis in SMA. Full mitochondrial sequences were obtained from 36 subjects with SMA complicated by lactic acidosis and 37 subjects with SMA without lactic acidosis. The two groups were matched for age, sex, and degree of anaemia. Results Compared with the reference sequence, a median of 60 nucleotide variants per individual (interquartile range 4–91) was found, with an average frequency of 3.97 variants per 1000 nucleotides. The frequency and distribution of non-synonymous DNA variants in genes associated with oxidative phosphorylation were not statistically different between the two groups. Non-synonymous variants predicted to have the most disruptive effect on proteins responsible for oxidative phosphorylation were present at a similar frequency in both groups. Conclusions Lactic acidosis in SMA does not appear to be consistently associated with the high prevalence of any mitochondrial gene variant.

it has been shown that lactic acidosis can result from severe anaemia with inadequate tissue oxygenation, and that lactic acidosis is relieved by transfusion of red blood cells with restoration of tissue oxygenation [5], there is no recognized explanation for the observation that some children develop lactic acidosis with SMA while others do not. Because oxygen-dependent bioenergetics occurs within mitochondria and relies in part on proteins encoded by mitochondrial genes, this study was designed to explore whether genetic differences in host mitochondria might account for differences in tolerance to the decreased tissue oxygenation that accompanies SMA. To investigate whether or not mitochondrial sequence variation might be associated with the development of lactic acidosis in severe anaemia, a matched case-control study among children with SMA was performed to compare the entire mitochondrial genome among severely anemic children with and without lactic acidosis.

Patient selection
Patient blood samples for mitochondrial DNA sequencing were collected during the course of two prior studies [3,5]. All samples were obtained from children aged 6 months to 5 years presenting for care to the paediatric acute care unit of Mulago Hospital in Kampala, Uganda. The hospital serves an indigent local community population which is ancestrally homogeneous with over 75% of mothers belonging to one of three tribes (Muganda 63%, Musoga 8%, and Munyankole 9%). All samples came from children with confirmed P. falciparum infection and with haemoglobin values ≤ 5 g/dL.
Blood lactate and haemoglobin levels were measured using point-of-care devices (LactatePro, Arkray; Hemocue 201) at the time of sample collection and prior to blood transfusion. Samples were frozen immediately after phlebotomy and stored in cryovials at − 20 °C or colder prior to testing. All samples were collected and stored with informed consent of the parent or guardian. Approval for sample collection, material transfer, and DNA sequencing was provided by the research ethics committee of Makerere University in Kampala, Uganda and by the Uganda National Council of Science and Technology.
From the above archive of frozen samples, two groups were selected based on the blood lactate level measured at the time of original phlebotomy: one group (n = 50) with blood lactate levels > 10 mM and one group with blood lactate levels < 5 mM (n = 50). Samples from the two groups were deliberately matched for patient age, gender, and haemoglobin concentration. Only patients who tested negative for the presence of haemoglobin S were included.

DNA extraction
For each patient, DNA was purified from a frozen blood sample using the Autopure LS instrument following the manufacturer's protocol (AutoGen, MA, USA). An additional purification was performed using the ZR-96 DNA Clean & Concentrator-5 kit (Zymo Research, CA, USA) to reduce heparin contamination. Purified DNA samples were then quantified using the Quant-iT PicoGreen dsDNA assay kit (Invitrogen, CA, USA) and measurements were made on an ultraviolet spectrophotometer (Gemini XPS).

DNA library preparation
Library preparation for each DNA sample was performed by long-range PCR followed by Nextera XT (Illumina, CA, USA) as described in the Human mtDNA Genome protocol for the Illumina Sequencing Platform [7]. Briefly, two long PCR amplicons were generated using two sets of primers (MTL-F1 and MTL-R1, MTL-F2, and MTL-R2) designed to encompass the entire human mitochondrial genome. The first set of primers generated a 9.1 kb fragment using primers MTL-F1 (5′-AAA GCA CAT ACC AAG GCC AC-3′) and MTL-F2 (5′-TTG GCT CTC CTT GCA AAG TT-3′). The second set of primers generated an 11.2 kb fragment using primers MTL-F2 (5′-TAT CCG CCA TCC CAT ACA TT-3′) and MTL-R2 (5′-AAT GTT GAG CCG TAG ATG CC-3′). For each patient DNA, two long-range PCR amplifications were performed using the TaKaRa LA TaqDNA Polymerase kit with a DNA input of 20 ng. Post-PCR products were quantified as described above and fragment sizes were verified using the DNA 12000 Bioanalyzer kit (Agilent, CA, USA).
The two amplicons for each patient were pooled and libraries were generated using the NexteraXT protocol (Illumina, CA, USA) automated on the Sciclone (Caliper/Perkin Elmer) with a DNA input of 1 ng. The Nex-teraXT library prep kit is a tagmentation process that fragments and adds adaptors to those fragments using an engineered transposome. The tagmented DNA is then amplified using a limited-cycle PCR program which adds Illumina adapters and unique indexes for each patient amplicon pool. After PCR, samples are purified using Ampure XP beads (Beckman Coulter, MA, USA).
Final libraries were quantified and fragment sizes were determined using High Sensitive D1000 Screentapes on the TapeStation instrument (Agilent, CA, USA). The expected size of final libraries is in the range of 500-1000 bp. Accurate measurement of sequence ready libraries was performed using the KAPA Library Quantification Kit (KAPA Biosystems, MA, USA) and run on the 7500 Fast Real-Time PCR System (Applied Biosystems, CA, USA).

Sequencing
The uniquely indexed patient libraries were pooled at equimolar concentrations based upon KAPA assay measurements, and the final pool was loaded at 10 pM (with 5% PhiX spike-in) on the Miseq instrument (Illumina, CA, USA) using a 600 cycle v3 Miseq kit (Illumina) to generate 300 bp paired end sequenced reads.

Variant calling and coverage
Raw sequencing data were analysed using the BaseSpace mtDNA Variant Processor App (Illumina), a commercially available web-based analysis pipeline. Briefly, this tool performs alignment and variant calling of samples against a reference mitochondrial DNA (mtDNA) genome, generating BAM alignment files and variant call files (VCF) for each of the samples. The analysis was restricted to samples with > 90% coverage of the entire mitochondrial sequence at 10× reads or greater. Additional information on software components of this analysis pipeline can be found online (https ://suppo rt.illum ina.com/downl oads/bases pace-mtdna -varia nt-proce ssorapp.html).

Variant annotations
Variant annotations were performed using MSeqDR: the Mitochondrial Disease Sequence Data Resource Consortium's tool called MvTool and VariantOneStop (https ://mseqd r.org/). Briefly, VCF files were processed to the required format by using python scripts and run through mvTool to get variant annotations in csv format. mvTool converts dozens of mtDNA variant formats into a standard revised Cambridge Reference Sequence (rCRS)-based HGVS format.
In addition, annotations generated by Varian-tOneStop include multiple-population frequencies from Mitomap and HmtDB, as well as annotations from Ensembl VEP, Mutalyzer, ClinVar, and from the MSe-qDR database. Functional predictions for missense variants were curated from MitImpact2 database which provides pre-computed pathogenicity predictions for mitochondrial variants (http://mitim pact.css-mende l.it/) and includes predictions from MToolBox, Poly-Phen2, MutationVariant Taster, and SIFT tools.
Haplogroup assignment for each individual was performed using the mtDNA-Server analysis software (https ://mtdna -serve r.uibk.ac.at/index .html), which utilizes the HaploGrep tool to determine haplogroup of mtDNA profiles (http://haplo grep.uibk.ac.at/). Bam files for each sample were uploaded directly into mtDNA-Server for this analysis.

Statistical analysis
For each group the median number of variants per person relative to the Cambridge Reference Sequence was compared using the Wilcoxon test; and the average number of variants per 1000 nucleotides was compared using the Student's t-test. The Fisher's test was used to compare the frequency of disruptive non-synonymous variants in the NADH dehydrogenase gene cluster. An alpha value of 0.05 was used to assess statistical significance. No corrections were made for multiple comparisons.

Results
From the original set of 100 archived samples, 7 were eliminated due to low DNA isolation and an additional 20 were eliminated due to low coverage after sequencing, leaving 73 subjects (36 with high blood lactate levels and 37 with normal lactates) for final analysis. See "Methods" for details. Clinical characteristics of the comparison groups are shown in Table 1. All patients had active P. falciparum malaria and met standard criteria for SMA. The two groups were well-matched for age and sex. Of particular importance, the two groups were matched for haemoglobin concentration and oxygen saturation, determinants of oxygen delivery, but had blood lactate levels that differed by more than fivefold.

Haplogroup assignments
Mitochondrial haplogroup assignments were made as described in the Methods. As expected in a population of individuals with common ancestry from East Africa, haplogroups L0-L4 were most commonly found. Because all subjects came from a common ancestral background, there was no difference in the distribution of haplogroups between the two groups see Table 2.

Mitochondria DNA sequence variation compared with the Cambridge Reference Sequence
In the total group of 73 individuals tested, 4770 nucleotide variants were observed when compared with the revised Cambridge Reference Sequence resulting in an average frequency per individual of 3.97 variants per 1000 nucleotides (see Table 3). The median (IQR) number of variants/person for the entire mitochondrial sequence in the high lactate and normal lactate group was 57.5 (42-93) and 61 (45-90), respectively. The distribution of variants for the genes associated with oxidative phosphorylation was not statistically different between the two groups (see Fig. 1, left panel). Most of the variants occurred in more than one individual, and thus the number of unique variants identified in the entire cohort was 665 variants. The majority of these (565/665) were synonymous variants, while the remaining variants were non-synonymous variants expected to result in missense changes at the amino acid level. In-frame indels and lossof-function (nonsense, frameshift) variants were not identified.

Non-synonymous variants within oxidative phosphorylation genes
Although most variants resulted in synonymous coding, non-synonymous variants in the regions between nucleotide 3307 and 15,887 that code for 13 genes responsible for oxidative phosphorylation were the main focus of analysis as these variants are capable of creating mutations affecting enzyme function. Nonsynonymous variants to the revised Cambridge Reference Sequence were identified using the MSeqDR annotations. When combining the results for all oxidative phosphorylation genes, the frequency of nonsynonymous variants per person was not different between the groups with high and normal lactate levels ( Fig. 1, right panel). Each of the oxidative phosphorylation genes was then analysed separately (see Table 4).
Although the frequency of non-synonymous variants per person per 1000 nucleotides was slightly higher in the high lactate group for nearly every gene, the differences were not statistically significant. In both patient groups, non-synonymous variants per patient per 1000 nucleotides were found less frequently in ND4L/4 and were found more frequently in ATP8/6, ND3, and Cyto-b compared with the other oxidative phosphorylation genes. The number of individuals with one or more non-synonymous variants in each of the oxidative phosphorylation genes was also not different between the two groups.  Non-synonymous variants when present were seen in as few as 0.3% to as many as 100% of individuals depending on the nucleotide position. The distribution of individual non-synonymous variants in the study population is shown in Fig. 2.

Non-synonymous variants predicted to represent a disruptive variant
Because non-synonymous variants are predicted to result in different degrees of disruption of the final protein, the analysis further focused on those variants predicted to be most disruptive to the oxidative phosphorylation polypeptides. Four different software applications designed to predict disruptive variants were used: MToolBox [8,9], PolyPhen2 [10], SIFT [11], and Mutation Taster [12]. Of the 100 different non-synonymous variants found within genes coding for oxidative phosphorylation, the 15 variants with the highest MToolBox disruption scores are shown in Table 5. Half of these occurred within genes coding for NADH dehydrogenase subunits (ND genes). When the data for the ND genes were combined in post hoc analysis, the frequency of disruptive variants at the

Discussion
Plasmodium falciparum malaria is a highly lethal and common disease of children worldwide. Lactic acidosis is a well-established complication of malaria infection and is associated with an increased risk for fatal outcome [6]. While lactic acidosis is present in approximately half of those with SMA, individual patients exhibit different levels of lactic acidosis despite the same degree of anaemia and many patients tolerate severe anaemia without developing elevated blood lactate levels [3,5,6]. The extent to which such differences are related to mitochondrial gene polymorphisms is unknown and was the focus of this study.
Malaria exerts a strong evolutionary selection pressure and is known to be associated with a large number of nuclear gene adaptations [4]. Although malaria infection has been associated with host mitochondrial pathology [13], previous studies exploring an association between severe malaria syndromes and human mitochondria DNA sequence variations were not identified. In this study, mitochondrial gene sequences between a cohort of children with or without lactic acidosis in the setting of SMA were compared for the first time. The two patient groups were well matched for ethnic background, and patients lived in close proximity to the hospital. The majority of mothers belonged to just three similar tribal affiliations. Mitochondrial haplotype analysis was balanced in the two groups and reflected haplogroups expected in an East African population [14,15]. Given the genetic heritage of the participants, a large number of synonymous and non-synonymous genetic variants were found when compared with the Cambridge Reference Sequence. These variants occurred throughout the mitochondrial genome.
The hypothesis of this study was that mitochondrial gene sequences might differ between those individuals who developed high blood lactate levels at the time of severe anaemia and those who did not. In particular, mitochondrial genes directly involved in oxidative phosphorylation were analyzed. Although the comparisons did not reach statistical significance for the sample size tested, the average number of non-synonymous variants per individual, the number of individuals with non-synonymous variants, and the occurrence of variants predicted to cause protein disruption were all higher in the group with elevated blood lactate levels. In particular, non-synonymous variants predictive to disrupt at least one member of the NADH dehydrogenase gene cluster were twice as common in children with high lactates, a finding that did not quite reach statistical significance (p = 0.052). Future studies with a larger sample size are needed to verify whether or not NADH dehydrogenase mutations might contribute to intolerance to severe anaemia.
The findings of this study suggest that the determinants of high blood lactate levels in patients with SMA may be related either to gene polymorphisms outside the mitochondrial genome or to other factors. Because the great majority of mitochondrial proteins are encoded by nuclear genes [16], variations in those genes might influence mitochondrial function. Alternatively, variation in other nuclear genes such as those of the glycolytic pathway, the Krebs cycle, or those affecting NAD+/NADH levels may influence blood lactate levels during severe anaemia. In addition, lactic acidosis in SMA may also be related to factors unrelated to host genetics such as microvascular physiology, nutritional status, or duration of illness. Understanding the fundamental drivers for life-threatening lactic acidosis in malaria awaits further research. This study has both strengths and weaknesses. The comparison groups were homogeneous and well matched for clinical features including age, gender, and the degree of anaemia and haemoglobin oxygen saturation. In addition, all subjects were documented to be negative for the sickle haemoglobin gene which would be the most likely confounding variable to affect comparisons. The method of sample preparation avoided confounding from parasite mitochondrial DNA. However, the two groups were not compared for variations in nuclear genes that affect mitochondrial function, nor were the groups compared for the number of mitochondria present in tissues expected to produce the greatest amount of lactate. In addition, the number of individuals tested was insufficient to detect low frequency differences in mitochondrial gene sequences between the comparison groups.

Conclusion
In this prospective case-control investigation of 73 Ugandan children with severe anaemia and P. falciparum malaria, mitochondria sequence variations were observed to occur in approximately 4/1000 nucleotides compared with the Cambridge Reference Sequence. However, within the constraints of the sample size tested, no statistically significant difference in mitochondrial DNA sequence variation was observed when comparing those with and without lactic acidosis. Increased blood lactate, an important marker for decreased survival in SMA, does not appear to be strongly associated with a high prevalence host mitochondrial DNA sequence variation.
Authors' contributions CF performed DNA sequencing and sequence analysis and participated in manuscript preparation. CC-G participated in study design, funding, clinical data collection, and manuscript preparation. AD enrolled patients, collected samples and clinical data, and participated in manuscript preparation.CM enrolled patients, collected samples and clinical data, and participated in manuscript preparation. HS was responsible for statistical review. SA participated in study design, DNA testing and analysis, and manuscript preparation. WD participated in study design, funding, clinical data collection, data analysis, and manuscript preparation. All authors read and approved the final manuscript.