Fig. 1 | Malaria Journal

Fig. 1

From: An optimized GATK4 pipeline for Plasmodium falciparum whole genome sequencing variant calling and analysis

Performance of the optimized GATK4, default GATK4 and GATK3 pipelines. (A) Pipeline performance using current high-quality Illumina read data (read length = 250 bp; insert size = 405–524 bp) from single-infection samples. Ten laboratory strains (7G8, Dd2, GA01, GB4, GN01, HB3, IT, KH01, KH02 and SN01) were included for all pipelines except GATK3, because only two of these samples (GN01 and KH02) were found in the available GATK3 VCFs on the MalariaGEN website. (B) Pipeline performance on simulated high-quality mixed-infection samples of IT + KH01 at 95:5, 90:10, 85:15, 80:20, 75:25 and 50:50 proportions (100× read depth). Only statistically significant differences obtained with the Wilcoxon test are shown (indicated by asterisks). Pipeline 1: GATK4 pipeline with default settings of HaplotypeCaller and GenotypeGVCFs, coupled with variant recalibration using the in silico training dataset. Pipeline 2: fully optimized GATK4 pipeline with altered HaplotypeCaller and GenotypeGVCFs parameters and variant recalibration (filtering) using the new in silico training dataset. Default GATK4 (crosses): default GATK4 pipeline, but with variants recalibrated using the publicly available cross dataset. GATK3: the same GATK3 pipeline used for MalariaGEN’s Pf6 release, in which variants are recalibrated using the cross training dataset. The red dashed line marks 90% performance.
