Table 1 How malaria datasets are simulated

From: Markov chain Monte Carlo and expectation maximization approaches for estimation of haplotype frequencies for multiply infected human blood samples

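| Patient # | MOI | BIOMASS | f.BIOMASS | msp1 | msp2 | ta109 | Haplotype | Observed MOI | Observed genotype |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 1 | 5.29E+10 | 1.000 | 10 | 34 | 3 | 112 | 1 | 112 |
| 2 | 3 | 8.06E+09 | 0.100 | 24 | 23 | 5 | 112 | 1* | 111* |
|   |   | 6.48E+10 | 0.803 | 20 | 6 | 5 | 111 |   |   |
|   |   | 7.86E+09 | 0.097 | 16 | 27 | 5 | 112 |   |   |
| 3 | 2 | 5.06E+10 | 0.474 | 24 | 35 | 3 | 111 | 2 | 111 |
|   |   | 5.62E+10 | 0.526 | 1 | 34 | 4 | 111 |   |   |
| 4 | 2 | 5.52E+10 | 0.487 | 21 | 34 | 4 | 122 | 2 | 133 |
|   |   | 5.81E+10 | 0.513 | 18 | 33 | 4 | 111 |   |   |
| 5 | 3 | 3.16E+10 | 0.432 | 23 | 32 | 9 | 111 | 2* | 133* |
|   |   | 1.35E+09 | 0.018 | 21 | 28 | 7 | 112 |   |   |
|   |   | 4.03E+10 | 0.550 | 23 | 27 | 9 | 122 |   |   |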
  1. The ‘population’ frequencies of the different MOI classes, polymorphic markers (msp1, msp2, ta109) and resistance haplotypes in the local malaria population are first defined. A number of patients are then simulated: five in this case, but more usually 100. For each patient, an MOI is first sampled according to the local ‘population’ frequencies (which will depend on local transmission intensity). This MOI determines the number of malaria clones in the patient. Each clone is then simulated in three steps: a biomass is assigned to the clone; its polymorphic markers are assigned at random according to the local true frequencies; and finally a resistance haplotype is assigned, again sampled from the local true frequencies. This process is repeated for each clone in each patient and gives rise to the data given in black font in the table above. The genetic signal observed in each patient (last two columns) is then calculated as described in the main text. In this example, genetic signals are not detected if they constitute ≤10 % of the biomass (f.BIOMASS gives the relative biomass of each clone in a patient). What is actually observed, and available for analysis, is the information given in italics. Genotyping limits produce errors, and those erroneous data are indicated by an asterisk: they are the data available to the researcher but do not truly reflect the genetic data of the parasites in that patient
  2. Haplotype is the resistance haplotype of each clone. It is defined at three SNPs; for each SNP: 1 = wildtype, 2 = mutant. Observed genotype is the genotype observed in each patient. It is defined at three SNPs; for each SNP: 1 = wildtype alone, 2 = mutant alone, 3 = both wildtype and mutant genetic signals observed in the blood sample
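
The observation step described in the footnotes can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the names `Clone` and `observe` are hypothetical, and the rule used here for the observed MOI (the largest number of distinct alleles seen at any one of msp1, msp2 or ta109 among the detected clones) is an assumption, since the actual calculation is described in the main text rather than in this table. The 10 % detection limit and the 1/2/3 genotype coding are taken from the footnotes, and the relative biomass is recomputed from the BIOMASS column rather than read from the rounded f.BIOMASS values.

```python
from typing import NamedTuple


class Clone(NamedTuple):
    biomass: float     # absolute biomass of the clone
    markers: tuple     # allele codes at (msp1, msp2, ta109)
    haplotype: str     # resistance haplotype, e.g. "112" (1 = wildtype, 2 = mutant)


DETECTION_LIMIT = 0.10  # signals at <= 10 % of total biomass are not detected


def observe(clones, limit=DETECTION_LIMIT):
    """Return (observed MOI, observed genotype) for one patient."""
    total = sum(c.biomass for c in clones)
    detected = [c for c in clones if c.biomass / total > limit]
    if not detected:
        return 0, ""
    # Assumed rule: observed MOI is the maximum number of distinct
    # alleles seen at any single polymorphic marker.
    moi = max(len({c.markers[i] for c in detected}) for i in range(3))
    # Per SNP: keep the allele if only one is seen, otherwise code 3 (mixed).
    genotype = ""
    for i in range(3):
        alleles = {c.haplotype[i] for c in detected}
        genotype += alleles.pop() if len(alleles) == 1 else "3"
    return moi, genotype


# The five simulated patients from the table.
patients = {
    1: [Clone(5.29e10, (10, 34, 3), "112")],
    2: [Clone(8.06e09, (24, 23, 5), "112"),
        Clone(6.48e10, (20, 6, 5), "111"),
        Clone(7.86e09, (16, 27, 5), "112")],
    3: [Clone(5.06e10, (24, 35, 3), "111"),
        Clone(5.62e10, (1, 34, 4), "111")],
    4: [Clone(5.52e10, (21, 34, 4), "122"),
        Clone(5.81e10, (18, 33, 4), "111")],
    5: [Clone(3.16e10, (23, 32, 9), "111"),
        Clone(1.35e09, (21, 28, 7), "112"),
        Clone(4.03e10, (23, 27, 9), "122")],
}

for pid, clones in patients.items():
    print(pid, observe(clones))
```

Under these assumptions the sketch reproduces the last two columns of the table. In patient 2, for example, the two clones carrying haplotype 112 each fall at or below the detection limit, so only the 111 clone is observed and both the observed MOI (1) and the observed genotype (111) misrepresent the true infection.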