The experimental setup was designed to assemble a collection of paired BF and DF images for various objects, including parasites (in BF), haemozoin objects (in DF), and non-haemozoin objects (in DF). This collection of paired images allowed staging of parasites via BF images, and provided a well-labelled set of DF objects for development of an automated haemozoin detection algorithm.
Blood smears from febrile patients were provided by Shoklo Malaria Research Unit in Thailand. The samples had been tested for malaria using three methods: rapid diagnostic tests (SD Bioline Malaria Ag P.f/Pan), Giemsa-stained microscopy, and polymerase chain reaction (PCR). The present experiment used only P. falciparum-positive samples, with no co-infection, and Plasmodium-negative samples (as confirmed by all three diagnostic methods). A total of 32,000 red blood cells (RBCs) from 23 samples (10 positive, 13 negative) were imaged. From this RBC sample set, 974 objects were found in DF and 160 parasites were found in BF after staining. The positive samples had a relatively high parasitaemia (as determined by microscopy), ranging from 43,000 to 290,000 parasites/μl, or about 8.6 to 58 parasites per 1,000 RBCs (assuming 5 × 10⁶ RBCs/μl).
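The conversion from parasites/μl to parasites per 1,000 RBCs can be checked with a few lines of Python. This is only an arithmetic sketch; the function name is illustrative, and the RBC density of 5 × 10⁶ per μl is the assumption stated in the text.

```python
# Assumed RBC density (cells per microlitre), as stated in the text.
RBC_PER_UL = 5e6

def parasites_per_1000_rbc(parasites_per_ul, rbc_per_ul=RBC_PER_UL):
    """Convert a parasite density (per microlitre) to parasites per 1,000 RBCs."""
    return 1000 * parasites_per_ul / rbc_per_ul

low = parasites_per_1000_rbc(43_000)    # lower end of the reported range
high = parasites_per_1000_rbc(290_000)  # upper end of the reported range
print(low, high)
```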
Unstained thin smears, fixed in methanol, were first imaged using DF microscopy, then Giemsa-stained and imaged using BF microscopy. These images were then aligned with software and used to determine whether a given object observed under DF corresponded to a parasite and, conversely, whether a given parasite generated a detectable DF object. Imaging Giemsa-stained smears directly with DF microscopy is unworkable, as the Giemsa stain itself scatters light efficiently and thus shows up under DF [18].
All images were obtained using a Malvern Morphologi G2 microscope with a Baumer FWx20c colour camera (Baumer, Ltd., Southington, CT, USA). For each positive sample, about 1,000 RBCs were imaged. This is a sufficient number of cells to accurately reflect the profile of the parasite load, given the high parasitaemia of these samples and assuming a uniform distribution of parasites throughout the blood sample.
Unstained smears were fixed in methanol and then coated in a very thin layer of microscope immersion oil and imaged without a coverslip under epi-illumination DF using a 50× Nikon 0.55 NA objective (Nikon Instruments, Inc., Melville, NY, USA). Nine overlapping 50× images were acquired. Immediately after DF imaging, the oil was removed by re-immersing in methanol and the smears were stained in a 10% solution of Giemsa (Sigma-Aldrich Corporation, St. Louis, MO, USA) in deionized water for 10 minutes. Forty-two overlapping BF images were obtained using an oil-immersion 100× Nikon 1.25 NA objective (Nikon Instruments, Inc., Melville, NY, USA), covering the DF 50× fields of view. For each negative sample, sixteen 50× DF images consisting of approximately 1,700 total blood cells were acquired. The CCD camera must be set to avoid pixel saturation in DF images, because saturation adversely affects colour detection by the automated algorithm (see Additional file 1 for details).
The unstained DF images and stained BF images were registered using alignment algorithms. Briefly, a Hough transform circle detection algorithm was used to find the centers of the RBCs in every image (both BF and DF). Correlation filter methods were then used on the (appropriately scaled) RBC center locations to calculate the offsets between the various images.
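The offset calculation can be illustrated with a minimal sketch. It assumes the RBC centres have already been found (for example by a Hough circle transform), that the two images differ only by a known scale factor and a translation, and it replaces the correlation filter with an equivalent vote over pairwise displacements: the histogram of all displacements between two point sets is their discrete cross-correlation. The function names, the bin size, and the toy coordinates are hypothetical and are not taken from Additional file 1.

```python
from collections import Counter

def scale_centres(centres, factor):
    """Rescale centre coordinates, e.g. from 50x pixels to 100x pixels."""
    return [(x * factor, y * factor) for x, y in centres]

def estimate_offset(centres_a, centres_b, bin_size=2.0):
    """Estimate the translation that maps centres_a onto centres_b.

    Every pair (a, b) votes for the displacement b - a, quantised to bin_size.
    The vote histogram is the cross-correlation of the two point sets, so its
    peak marks the displacement shared by the largest number of pairs.
    """
    votes = Counter()
    for ax, ay in centres_a:
        for bx, by in centres_b:
            key = (round((bx - ax) / bin_size), round((by - ay) / bin_size))
            votes[key] += 1
    (kx, ky), _ = votes.most_common(1)[0]
    return kx * bin_size, ky * bin_size

# Toy data: four RBC centres in a DF image, and the same cells in a BF image
# at twice the magnification, shifted by (30, -14) pixels.
df_centres = [(10, 12), (40, 15), (25, 44), (60, 52)]
bf_centres = [(2 * x + 30, 2 * y - 14) for x, y in df_centres]

offset = estimate_offset(scale_centres(df_centres, 2.0), bf_centres)
print(offset)
```

Voting on centre positions rather than raw pixels keeps the comparison insensitive to the very different appearance of cells in stained BF and unstained DF images.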
Every object of interest (e.g. a bright spot in DF or a parasite in BF) was thus represented by matching thumbnail images in both BF and DF modalities. In particular, each bright spot detected in the 50× DF images was linked to a stained BF 100× thumbnail (see, for example, Figure 1), while each parasite detected in the stained image was linked to a 50× DF thumbnail. This pairing of DF and BF thumbnails allowed construction of an accurate truth table: each DF bright spot was labelled as positive (associated with a parasite in BF) or negative (no parasite), and each parasite was labelled as either associated with a DF bright spot or not. Giemsa-stain reading was performed by eye on the 100× BF images and parasites were staged according to Silamut et al. [22], using the template shown in Figure 2.
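The structure of such a truth table can be sketched as follows. The text does not state a matching rule, so the proximity criterion, the distance cut-off, and all names below are assumptions made purely for illustration; coordinates are taken to be in a shared frame after registration.

```python
import math

def build_truth_table(df_spots, bf_parasites, max_dist=5.0):
    """Label DF spots and BF parasites by proximity in a shared coordinate frame.

    Returns two lists of booleans:
      spot_labels[i]     is True if DF spot i lies near some BF parasite
      parasite_labels[j] is True if BF parasite j lies near some DF spot
    max_dist is a hypothetical cut-off in pixels.
    """
    def has_neighbour(point, others):
        return any(math.dist(point, other) <= max_dist for other in others)

    spot_labels = [has_neighbour(spot, bf_parasites) for spot in df_spots]
    parasite_labels = [has_neighbour(par, df_spots) for par in bf_parasites]
    return spot_labels, parasite_labels

# Toy example: one spot coincides with a parasite; one parasite has no spot.
df_spots = [(100, 100), (200, 50), (300, 300)]
bf_parasites = [(102, 99), (400, 400)]
spot_labels, parasite_labels = build_truth_table(df_spots, bf_parasites)
print(spot_labels, parasite_labels)
```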
Dataset statistics
A fully automated image-processing algorithm was applied to a labelled dataset of DF objects. This dataset consisted of 974 thumbnail DF images of objects of interest, drawn from DF 50× images obtained as described above. The objects were “bright spots”, selected by applying a threshold filter to a grayscale image of each sample. Due to variations in the brightness of the cell membrane and background, the threshold for each sample was adaptively chosen to be higher than the brightness of the cells due to haemoglobin scattering. The adaptive threshold ensured that most bright spots (i.e. spots with a signal stronger than the cell membrane signal) were captured, while limiting the number of negative objects.
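A minimal sketch of per-sample adaptive thresholding is given below. The way the cell signal level is estimated here (a percentile of the grayscale histogram multiplied by a safety margin) is an assumption for illustration only; the percentile, the margin, and the function names are hypothetical, and the actual rule is described in Additional file 1.

```python
def adaptive_threshold(gray, cell_percentile=0.9, margin=1.2):
    """Choose a threshold above the estimated cell-scattering level.

    gray is a 2D list of grayscale values. The cell level is estimated as a
    percentile of all pixel values (assumption), then raised by a margin so
    that only spots brighter than the cells pass.
    """
    pixels = sorted(value for row in gray for value in row)
    index = min(len(pixels) - 1, int(cell_percentile * len(pixels)))
    return margin * pixels[index]

def bright_pixels(gray, threshold):
    """Return (row, col) positions of pixels brighter than the threshold."""
    return [(r, c)
            for r, row in enumerate(gray)
            for c, value in enumerate(row)
            if value > threshold]

# Toy image: dark background (10), one cell region (40), two bright spots (200).
image = [[10] * 10 for _ in range(10)]
for r in range(2, 6):
    for c in range(2, 7):
        image[r][c] = 40
image[3][4] = 200   # bright spot inside the cell
image[8][8] = 200   # bright spot on the background

threshold = adaptive_threshold(image)
spots = bright_pixels(image, threshold)
print(threshold, spots)
```

Because the threshold is derived from each image's own histogram, a sample with brighter cells or background automatically receives a higher cut-off.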
The dataset contains 51 haemozoin objects and 923 non-haemozoin objects (i.e. objects with no corresponding parasite in the paired BF image). Figure 1A and B show examples of both haemozoin and non-haemozoin objects imaged in DF. The non-haemozoin objects include bubbles, dust particles, and other unidentified artifacts. The haemozoin objects correspond to 41 mid-late rings (> 6 hours), and 10 trophozoites and schizonts. Although some of the blood samples contain many early rings (0–6 hours), these lack sufficient haemozoin to be detected (see Results) and thus did not have corresponding haemozoin objects.
The BF 100× images contained 160 parasites, including 150 rings. Of these rings, 41 had an associated DF 50× bright spot as described above. The remaining 109 rings had no associated DF 50× bright spot, and were thus invisible to the DF detection system.
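The counts reported in this section can be cross-checked with simple arithmetic. The sketch below uses only the numbers given in the text; the variable names are illustrative, and it assumes that the 10 non-ring parasites seen in BF are the same 10 trophozoites and schizonts counted among the haemozoin objects.

```python
df_objects = {"haemozoin": 51, "non_haemozoin": 923}
parasites_total = 160      # parasites found in BF
rings_total = 150          # of which rings
rings_with_spot = 41       # rings with an associated DF bright spot

late_stage = parasites_total - rings_total          # non-ring parasites
rings_without_spot = rings_total - rings_with_spot  # rings invisible in DF

# Consistency checks against the totals quoted in the text.
assert sum(df_objects.values()) == 974
assert rings_with_spot + late_stage == df_objects["haemozoin"]
assert rings_without_spot == 109

ring_detection_fraction = rings_with_spot / rings_total
print(f"{ring_detection_fraction:.1%}")  # prints 27.3%
```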
Algorithm outline
The purpose of the experiment was to assess the value of haemozoin as a biomarker in two ways: first, by determining if haemozoin can be distinguished from false positives by an automated image-processing algorithm; and second, by characterizing the abundance of haemozoin in early ring-stage parasites. The goal of the image-processing algorithm was to classify bright spots seen in DF as haemozoin or non-haemozoin. The algorithm is fully described in Additional file 1 and only briefly described here. The algorithm was developed using supervised learning on the fully labelled training set. It exploits both the shape and colour of the haemozoin signal in DF. The time required to image 1,000 cells and process the data with the complete algorithm is ~10 s, although this has not yet been optimized for speed.
Shape
Haemozoin is deposited irregularly in the parasite, so its DF signal is typically asymmetric, whereas non-haemozoin objects are typically radially symmetric (i.e. close to circular). One branch of the algorithm therefore measures how radially symmetric an object is.
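One possible way to score radial symmetry is sketched below. It is not the measure used in Additional file 1; the metric (deviation of each pixel from the mean of its ring around the intensity centroid), the function name, and the toy thumbnails are all assumptions for illustration.

```python
import math
from collections import defaultdict

def radial_asymmetry(thumb):
    """Score how far a thumbnail departs from radial symmetry.

    thumb is a 2D list of non-negative intensities. Pixels are grouped into
    rings by their rounded distance from the intensity-weighted centroid. A
    radially symmetric object has near-constant intensity within each ring,
    so the summed absolute deviation from the ring means is small.
    """
    total = sum(v for row in thumb for v in row)
    cy = sum(r * v for r, row in enumerate(thumb) for v in row) / total
    cx = sum(c * v for row in thumb for c, v in enumerate(row)) / total

    rings = defaultdict(list)
    for r, row in enumerate(thumb):
        for c, v in enumerate(row):
            rings[round(math.hypot(r - cy, c - cx))].append(v)

    residual = 0.0
    for values in rings.values():
        mean = sum(values) / len(values)
        residual += sum(abs(v - mean) for v in values)
    return residual / total

# Toy thumbnails: a round Gaussian blob and an irregular L-shaped object.
round_blob = [[100 * math.exp(-((r - 4) ** 2 + (c - 4) ** 2) / 4.5)
               for c in range(9)] for r in range(9)]
l_shape = [[0] * 9 for _ in range(9)]
for c in range(2, 7):
    l_shape[4][c] = 100
for r in range(1, 4):
    l_shape[r][6] = 100

score_round = radial_asymmetry(round_blob)
score_irregular = radial_asymmetry(l_shape)
print(score_round, score_irregular)
```

In this toy case the irregular object scores markedly higher than the round blob, which is the kind of separation the shape branch relies on.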
Colour
Haemozoin scatters predominantly blue light due to its size and dielectric properties [19], while non-haemozoin objects scatter more broad-band light. The second part of the algorithm maps object colour into HSV space (Hue-Saturation-Value, an alternative colour basis to RGB). Haemozoin objects cluster closely together in the saturated blue region of HSV colour space.
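The colour mapping can be illustrated with the standard-library colorsys module. The hue window and saturation cut-off below are hypothetical placeholders, not the decision boundary learned from the training set, and the function name and sample colours are likewise illustrative.

```python
import colorsys

def is_saturated_blue(rgb, hue_range=(0.55, 0.72), min_saturation=0.5):
    """Check whether an 8-bit RGB colour falls in a saturated blue region.

    colorsys.rgb_to_hsv expects channels in [0, 1] and returns hue,
    saturation and value in [0, 1]; pure blue has a hue of 2/3.
    hue_range and min_saturation are assumed values for illustration.
    """
    r, g, b = (channel / 255.0 for channel in rgb)
    hue, saturation, _value = colorsys.rgb_to_hsv(r, g, b)
    return hue_range[0] <= hue <= hue_range[1] and saturation >= min_saturation

blue_object = (40, 80, 230)      # strongly blue scatterer
white_object = (220, 220, 225)   # broad-band (whitish) scatterer
print(is_saturated_blue(blue_object), is_saturated_blue(white_object))
```

Working in HSV separates colour (hue and saturation) from brightness (value), so objects of different intensity but similar colour fall close together.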
This algorithm ignores gametocytes, for two reasons. First, gametocytes’ haemozoin signature is not amenable to detection by the algorithm’s particular colour analysis (see Additional file 1 for details). Second, gametocytes are very distinctive in DF (see Figure 1, row 2, column 5) and are thus easily detectable by other means.