Resistance to current anti-parasitic compounds is widespread and increasing, and now extends to newer treatments such as artemisinin [1, 2]. While in-depth studies are ongoing on a relatively small number of selected putative targets for future exploitation, few resources focus on data mining and target identification across the complete malaria genome in concert with relations to chemical compounds. Currently available resources that may be useful for target identification include PlasmoDB, TDR Targets, PlasmoMap, the Tropical Diseases Kernel and the original version of Discovery.
Recent approaches have illustrated the value of predicting the association of chemical compounds with putative protein drug targets, especially when the targets of compounds with known activity against the parasite, such as those in the GSK dataset, can be extrapolated using protein-ligand interaction databases such as ChemProt [8, 9]. The Discovery resource takes a similar approach, associating chemical compounds with malaria proteins using sequence homology and selective chemical similarity searches. While the resource focuses primarily on Plasmodium falciparum, it contains information for all proteins from Plasmodium vivax, Plasmodium yoelii, Plasmodium knowlesi, Plasmodium chabaudi and Plasmodium berghei, as well as for the human and mosquito hosts. Protein information includes sequences and annotations from PlasmoDB, Ensembl and VectorBase, functional predictions, gene ontology terms, orthology information, structural information, metabolic pathways, predicted putative protein-ligand interactions, druggability predictions and literature links. The resource also contains chemical compounds from the ChEMBL database, with chemical search functionality and putative ligand-protein prediction information. Protein searches may be performed using accession numbers, keywords or an advanced multi-parameter filtering interface. Chemical searches may be performed using keywords, SMILES strings or chemical structures.
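As an illustration of what a chemical similarity search involves, the following minimal Python sketch ranks a small compound library against a query using the Tanimoto coefficient. The text above does not state which similarity measure Discovery uses, so the choice of Tanimoto is an assumption, as are the function names, the threshold and the toy fingerprints (sets of integer substructure keys standing in for fingerprints that a cheminformatics toolkit would derive from SMILES strings).

```python
from typing import Dict, FrozenSet, List, Tuple

Fingerprint = FrozenSet[int]  # hypothetical: set of substructure keys present


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared features / features in either set."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def similar_compounds(
    query: Fingerprint,
    library: Dict[str, Fingerprint],
    threshold: float = 0.7,  # assumed cut-off, not taken from the source
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs scoring at or above threshold, best first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [hit for hit in scored if hit[1] >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy data: compound names and keys are invented for illustration only.
query = frozenset({1, 2, 3, 4})
library = {
    "cmpd_a": frozenset({1, 2, 3, 4, 5}),
    "cmpd_b": frozenset({1, 2, 9}),
    "cmpd_c": frozenset({1, 2, 3, 4}),
}
hits = similar_compounds(query, library)
```

In a scheme of this kind, compounds that pass the threshold could inherit the putative protein targets already recorded for the query compound, which is the sense in which chemical similarity complements sequence homology when proposing ligand-protein associations.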
This new implementation of Discovery (version 2) is a complete rewrite of the original Python-based system using Java and NetBeans, with automated updates based largely on web services. The previous version included information related to sequence features, orthology, ontology terms, structural information, metabolic pathways and protein-ligand interactions. New functionality comprises expression information, literature information from PubMed abstracts, druggability predictions from DrugEBIlity where available, malaria-related data from the clinical trials database and Wiki-like user annotations. Specific advantages of Discovery-2 include extensive functionality to identify putative associations of proteins with ligands, an advanced chemical structure search interface encompassing the content of the ChEMBL database and an interactive system for refining putative target selections based on data mining of molecular properties. The resource also contains data for the mosquito and human hosts to allow easy comparative analysis.