Videogame and crowdsourcing architecture
A videogame and a cloud-based backend architecture of servers and databases were developed to run the experiments in real time, showing new image samples to the players and collecting all the information they produced. After multiple in-house tests, MalariaSpot Bubbles was made available to the public on April 25, 2016 (World Malaria Day). By July 2016, more than 25,000 people from 121 countries had played MalariaSpot Bubbles. In this period, gamers played a total of 596,235 puzzles over the 114 image samples tested in the game, generating a database of more than half a million species classification decisions.
Players’ behaviour
Out of a total of 596,235 decisions, 447,176 (75%) tagged the correct species. Gamers identified the species correctly most of the time at all levels, but performance decreased with the number of options (level), as expected (Fig. 2a, b). The digital nature of the classification task made it possible to quantify the time from the moment the image of the infected blood sample was shown to the moment the decision was taken. The mean time to decide the malaria species shown in the image was 2.09 s and depended on the level of difficulty, ranging from 1.96 s for 2 species to 2.33 s when the decision was between 5 possible species (Fig. 2d). Interestingly, gamers spent less time when they chose the correct species at all the levels analysed (Fig. 2c). As players performed more classifications, the percentage of correct classifications increased (Additional file 2: Figure S2a) and the time they spent decreased (Additional file 2: Figure S2b). The percentage of success also increased once players had beaten the four levels of difficulty (Additional file 2: Figure S2c).
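As an illustration of how per-level success rates and mean decision times of this kind can be derived from raw decision records, the following is a minimal sketch. The record layout (level, correct flag, seconds) and the function name `summarise_by_level` are assumptions made for the example, not the structure of the MalariaSpot Bubbles database, and the records are toy values rather than study data.

```python
from collections import defaultdict


def summarise_by_level(decisions):
    """Return {level: (fraction_correct, mean_time_s)}.

    Each record is a (level, is_correct, seconds) tuple; this layout is
    hypothetical and chosen only for the example.
    """
    totals = defaultdict(lambda: [0, 0, 0.0])  # count, hits, summed time
    for level, is_correct, seconds in decisions:
        entry = totals[level]
        entry[0] += 1
        entry[1] += int(is_correct)
        entry[2] += seconds
    return {
        level: (hits / count, time_sum / count)
        for level, (count, hits, time_sum) in totals.items()
    }


# Toy records, not data from the study.
toy_decisions = [
    (1, True, 1.8), (1, True, 2.0), (1, False, 2.2), (1, True, 2.0),
    (4, True, 2.1), (4, False, 2.5),
]
summary = summarise_by_level(toy_decisions)
```

The same grouping could be keyed on (level, correct) instead of level alone to compare decision times for correct and incorrect answers.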
Species differentiation
Gamers accurately differentiated the five malaria species (P ≤ 0.0001). It is important to note that the probability of success was higher at level 1 (when only P. falciparum and P. vivax were shown) than at level 4 (when all the species were shown at the same time). Results were analysed independently for every level (Fig. 3).
At level 1, 116,080 decisions were obtained for questions related to P. falciparum and 116,080 answers for P. vivax. Both species were significantly differentiated by gamers (P ≤ 0.0001). The percentages of correct answers were 79% and 83% for P. falciparum and P. vivax, respectively. The specificity of this level was 99.9% for both species.
At level 2, the percentage of success for the newly introduced species, P. ovale, reached 79%. Nevertheless, the percentage of hits for P. vivax decreased to 66% (it was mistaken for P. ovale in 23% of the cases). The successful identification of P. falciparum was similar to level 1 (82% of correct answers). At this level, 163,661 decisions were registered: 54,358 for P. falciparum images, 54,542 for P. vivax and 54,761 for P. ovale. Again, the three species were correctly differentiated by gamers (P ≤ 0.0001) and a specificity of diagnosis of 99.9% was obtained.
At level 3, users were still able to distinguish the four species shown significantly (P ≤ 0.0001). A specificity of diagnosis of 99.9% for P. falciparum, P. vivax and P. ovale and of 90% for P. malariae was achieved. The level of success for the three species that had already been introduced was similar to that observed at level 2 (80% of hits for P. falciparum with a total of 29,185 decisions, 71% for P. vivax with a total of 29,392 decisions and 62% for P. ovale with a total of 29,596 decisions). For the new species, P. malariae, a total of 29,920 answers were registered, with a 62% success rate.
At the last and most difficult level, gamers continued to differentiate the five species shown (P ≤ 0.0001). A similar level of hits was obtained for the species already introduced: P. falciparum (81% for a total of 16,344 decisions), P. ovale (69% for a total of 16,689), P. vivax (61% for a total of 16,612) and P. malariae (57% for a total of 16,562). The most challenging species to differentiate at level 4 was P. knowlesi, with 45% of hits over 16,490 decisions. This newly introduced species was mostly mistaken for P. malariae (in 29% of the cases), although both species were significantly differentiated (P ≤ 0.0001). The specificity at this level was 99.9% for P. falciparum, P. vivax and P. ovale, 90% for P. malariae and 81% for P. knowlesi.
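Per-species hit rates and confusion rates such as those reported above (for example, how often one species is mistaken for another) can be tabulated from pairs of true and chosen species. The sketch below is only an illustration: the function name `confusion_rates`, the pair layout and the toy counts are assumptions, and it does not reproduce the specificity calculation used in the study.

```python
from collections import Counter, defaultdict


def confusion_rates(decisions):
    """Row-normalised confusion table.

    decisions: iterable of (true_species, chosen_species) pairs.
    Returns rates[true][chosen] = share of answers for images of `true`
    in which players chose `chosen`.
    """
    counts = defaultdict(Counter)
    for true_species, chosen_species in decisions:
        counts[true_species][chosen_species] += 1
    return {
        true_species: {
            chosen: n / sum(row.values()) for chosen, n in row.items()
        }
        for true_species, row in counts.items()
    }


# Toy decisions, not data from the study.
toy_pairs = (
    [("P. vivax", "P. vivax")] * 6
    + [("P. vivax", "P. ovale")] * 3
    + [("P. vivax", "P. falciparum")] * 1
    + [("P. ovale", "P. ovale")] * 4
    + [("P. ovale", "P. vivax")] * 1
)
rates = confusion_rates(toy_pairs)
```

Here `rates["P. vivax"]["P. vivax"]` is the hit rate for P. vivax images, and `rates["P. vivax"]["P. ovale"]` is the share of P. vivax images tagged as P. ovale.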
Collaborative species classification
Finally, the minimum number of on-line analysts over the same sample that would be needed to obtain an accurate diagnosis in a hypothetical real-time system was evaluated. For this, 30 simulations of the analysis of each of the image samples were performed for group sizes from 2 to 40 gamers at each of the levels. In each simulation, the individual decisions of a random group of players of a given size were combined into a collective decision by a voting algorithm that chooses the species obtaining the most votes (Fig. 4). For level 1 (P. falciparum and P. vivax), the probability of correct species classification reached 99.9% when the answers of a minimum of 14 gamers were combined. For level 2 (P. falciparum, P. vivax and P. ovale), it reached 99.9% when the group of gamers had a minimum size of 15. When 4 species were compared (level 3: P. falciparum, P. vivax, P. ovale and P. malariae), the maximum probability reached was 99%, obtained with groups of at least 25 gamers. For the most difficult level (level 4: P. falciparum, P. vivax, P. ovale, P. malariae and P. knowlesi), the maximum probability reached was 80%, obtained when combining the answers of at least 17 gamers.
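The simulation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the names `majority_vote` and `collective_accuracy` are hypothetical, players are drawn without replacement, and ties are broken at random, since the text does not specify how ties or sampling were handled. The decision pool is toy data.

```python
import random
from collections import Counter


def majority_vote(votes, rng):
    """Return the species with the most votes.

    Ties are broken at random (an assumption; the text does not say how
    ties were resolved).
    """
    counts = Counter(votes)
    top = max(counts.values())
    winners = sorted(s for s, n in counts.items() if n == top)
    return rng.choice(winners)


def collective_accuracy(decisions_per_image, truth, group_size,
                        n_sim=30, seed=0):
    """Share of simulated groups whose collective vote matches the truth.

    decisions_per_image: {image_id: [chosen_species, ...]} individual answers
    truth:               {image_id: true_species}
    """
    rng = random.Random(seed)
    hits = trials = 0
    for image_id, decisions in decisions_per_image.items():
        for _ in range(n_sim):
            # Random group of players, sampled without replacement (assumed).
            group = rng.sample(decisions, group_size)
            hits += majority_vote(group, rng) == truth[image_id]
            trials += 1
    return hits / trials


# Toy pool of individual answers for one image, not data from the study.
toy_pool = {"img1": ["P. falciparum"] * 80 + ["P. vivax"] * 20}
toy_truth = {"img1": "P. falciparum"}
accuracy_by_size = {
    size: collective_accuracy(toy_pool, toy_truth, size)
    for size in range(2, 41)
}
```

Reading `accuracy_by_size` from the smallest group size upwards gives the kind of curve from which a minimum group size for a target probability can be read off.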