Evaluation of three parasite lactate dehydrogenase-based rapid diagnostic tests for the diagnosis of falciparum and vivax malaria

Background In areas where non-falciparum malaria is common, rapid diagnostic tests (RDTs) capable of distinguishing malaria species reliably are needed. Such tests are often based on the detection of parasite lactate dehydrogenase (pLDH).
Methods In Dawei, southern Myanmar, three pLDH-based RDTs (CareStart™ Malaria pLDH (Pan), CareStart™ Malaria pLDH (Pan, Pf) and OptiMAL-IT®) were evaluated in patients presenting with clinically suspected malaria. Each RDT was read independently by two readers. A subset of patients with microscopically confirmed malaria had their RDTs repeated on days 2, 7 and then weekly until negative. At the end of the study, samples of study batches were sent for heat stability testing.
Results Between August and November 2007, 1004 patients aged between 1 and 93 years were enrolled in the study. Slide microscopy (the reference standard) diagnosed 213 Plasmodium vivax (Pv) mono-infections, 98 Plasmodium falciparum (Pf) mono-infections and no malaria in 650 cases. The sensitivities (sens) and specificities (spec) of the RDTs for the detection of malaria were: CareStart™ Malaria pLDH (Pan) test: sens 89.1% [CI95 84.2-92.6], spec 97.6% [CI95 96.5-98.4]; OptiMAL-IT®: Pf+/- other species detection: sens 95.2% [CI95 87.5-98.2], spec 94.7% [CI95 93.3-95.8]; non-Pf detection alone: sens 89.6% [CI95 83.6-93.6], spec 96.5% [CI95 94.8-97.7]; CareStart™ Malaria pLDH (Pan, Pf): Pf+/- other species: sens 93.5% [CI95 85.4-97.3], spec 97.4% [CI95 95.9-98.3]; non-Pf: sens 78.5% [CI95 71.1-84.4], spec 97.8% [CI95 96.3-98.7]. Inter-observer agreement was excellent for all tests (kappa > 0.9). The median time for the RDTs to become negative was two days for the CareStart™ Malaria tests and seven days for OptiMAL-IT®. Tests were heat stable up to 90 days except for OptiMAL-IT® (Pf specific pLDH stable to day 20 at 35°C).
Conclusion None of the pLDH-based RDTs evaluated was able to detect non-falciparum malaria with high sensitivity, particularly at low parasitaemias. OptiMAL-IT® performed best overall and would perform best in an area of high malaria prevalence among screened fever cases. However, heat stability was unacceptable and the number of steps to perform this test is a significant drawback in the field. A reliable, heat-stable, highly sensitive RDT, capable of diagnosing all Plasmodium species has yet to be identified.


Background
Malaria is one of the few diseases for which it is quick and simple to make an accurate biological diagnosis, even in a low-technology setting. Despite this, clinical diagnosis is practised widely, even though it has been shown repeatedly to be unreliable [1,2]. In a cross-over study in Zanzibar of 1,887 patients, use of RDTs altered prescribing patterns of antimalarials and antibacterials and resulted in improved patient management without increasing costs [3]. However, availability of biological diagnosis does not necessarily prevent over-treatment. In some areas, patients are still likely to receive an antimalarial treatment in the presence of a negative slide or RDT result [4,5].
The choice of diagnostic method in most of the malaria-affected world will be between microscopy and a rapid diagnostic test. Maintaining a high standard of microscopy is challenging and depends on having well-trained, experienced technicians who are not overburdened with slides to read, a continuous supply of good quality staining reagents and appropriately maintained microscopes.
RDTs for malaria are based on the detection of either histidine-rich protein 2 (HRP-2), produced only by Plasmodium falciparum; parasite-specific lactate dehydrogenase (pLDH), produced by all four species; or Plasmodium aldolase from the parasite glycolytic pathway, also found in all species. HRP-2 based tests may be misleading in areas of high transmission because they remain positive for a number of days or weeks after an infection, even if treated; thus a positive result with a history of a recently treated infection is difficult to interpret. Another limitation of HRP-2 based tests is their geographically variable sensitivity, attributed to polymorphisms in HRP-2 [6]. Tests based on detection of pLDH or aldolase allow parasite speciation, do not appear to show geographical variability in their ability to detect malaria and revert to negative more quickly than HRP-2 based tests, although production of pLDH from gametocytes after elimination of asexual stages means some will stay positive for a number of days [7]. However, to date the sensitivity of these tests under field conditions has frequently been reported as falling below 90% [8,9]. There are concerns about the stability of all types of tests if transportation and storage conditions are not controlled, but pLDH tests appear to be particularly vulnerable [10].
In areas where mixed species infections are common e.g. Asia, Latin America, a reliable test to distinguish between species is needed, since the treatments recommended for falciparum and non-falciparum infections are different. In most parts of the world non-falciparum infections remain susceptible to chloroquine, although chloroquine-resistant vivax malaria has emerged in parts of Indonesia and South America [11,12].
To be adopted in the field a test needs to be >95% sensitive and specific for the detection of falciparum malaria at a parasitaemia ≥ 100/μl. High sensitivity and specificity for the diagnosis of non-falciparum malaria are desirable; however there is no internationally agreed threshold.

Criteria used to select the tests under evaluation in this study
The CareStart™ Malaria pLDH (Pan) test, a two-line test shown to be reliable for the diagnosis of falciparum malaria, was selected to validate its use in an area where non-falciparum infections were prevalent [13]. The WHO does not endorse particular tests but lists products available for procurement which meet certain criteria, e.g. manufactured to good manufacturing practice standards [14]. Since the time this study was performed, products must also have been volunteered for the WHO product testing programme in order to be included in the procurement list. The CareStart™ Malaria pLDH (Pan, Pf) three-line test was included in this study in the hope that the good performance of the two-line test might be reproduced with this test, and OptiMAL-IT® was also included, since published results suggested it was one of the most sensitive RDTs for diagnosing non-falciparum species [15].

Aims
The main aims of the study were to evaluate the sensitivity, specificity, positive and negative predictive values of the three RDTs for the detection of vivax and falciparum malaria compared to microscopy of Giemsa-stained blood slides, to study the time taken for positive tests to become negative, and to evaluate the inter-observer agreement, ease of use of the tests and heat stability.
Subsidiary aims included investigating differences in sensitivity of the tests in the under five year age group and the effect of parasitaemia or other covariates on sensitivity. In addition, sensitive PCR species detection was evaluated as an alternative reference standard in a subset of patients, which included all patients with positive slides, all patients with false positive RDT results compared to microscopy, and 20% of patients with negative slides selected at random.

Study site and population
The study took place in a clinic where the non-governmental medical organisation MSF-Switzerland works with the agreement of the Ministry of Health in Sonsinphya, Thayetchaung Township, Dawei, in the south of Myanmar. The four Plasmodium species (falciparum, vivax, ovale and malariae) that commonly affect humans are found here, where malaria has a seasonal incidence, with a peak in the rainy season between June and August. The vast majority of cases (>80%) occur in patients over 5 years old.

Ethical review
The study protocol was approved by the Comité de Protection des Personnes (CPP), Ile-de-France XI, St-Germain-en-Laye, France, and local authorities in Myanmar gave their permission for the study to be implemented.

Study design
A prospective, single-blind evaluation of three RDTs compared to slide microscopy.

Sample size
Based on local 2006 data, when prevalences of Pf and Pv among fever cases were 30% and 10% respectively, it was estimated that 460 patients would be required to detect Pf with sensitivity and specificity of 90% (alpha error 0.05, precision 5%) and 960 patients would be required to detect Pv with the same sensitivity and specificity but a precision of 6% (N = 1383 with precision 5%).
The sample size was set at 1,000. The study was not powered to evaluate detection of mixed infections. A convenient sample size of 120 patients with a positive RDT result and positive malaria slide (60 Pf, 60 Pv) was chosen to describe the time taken for the RDTs to become negative.
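The figures quoted above can be approximately reproduced with a short calculation. The paper does not state which formula was used; the sketch below assumes the usual normal-approximation sample size for a proportion, divided by the expected prevalence so that enough slide-positive patients are enrolled. The function name `patients_needed` is hypothetical.

```python
from statistics import NormalDist


def patients_needed(expected_value, precision, prevalence, alpha=0.05):
    """Approximate number of fever cases to enrol.

    Assumption (not stated in the paper): normal-approximation formula
    n = z^2 * p * (1 - p) / d^2 for the number of infected patients,
    scaled up by 1 / prevalence to give the number of fever cases.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    infected = z ** 2 * expected_value * (1 - expected_value) / precision ** 2
    return infected / prevalence


if __name__ == "__main__":
    # Pf: prevalence 30%, sensitivity 90%, precision 5%
    print(round(patients_needed(0.90, 0.05, 0.30)))
    # Pv: prevalence 10%, sensitivity 90%, precision 6% and 5%
    print(round(patients_needed(0.90, 0.06, 0.10)))
    print(round(patients_needed(0.90, 0.05, 0.10)))
```

Under these assumptions the three calls give values close to the 460, 960 and 1383 reported in the text, which suggests (but does not prove) that a formula of this kind was used.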

Informed consent
Patients were provided with an information sheet and the study was explained in their own language by the study personnel (Burmese or Karen). Written consent was obtained from participants or parent/guardian in the case of children.

Inclusion and exclusion criteria
Main inclusion criteria were age > two years and suspected malaria, defined as fever (tympanic temperature > 37.5°C) or a history of fever in the previous 48 hours, with no signs of severity/danger signs. Exclusion criteria were pregnancy or having received a treatment course of antimalarials in the previous 4 weeks. To be eligible for inclusion into the follow-up study subjects needed to have a positive malaria slide for Pf or Pv mono-infection in conjunction with a positive RDT and to be able to attend follow-up until 28 days.
The terms 'two-line' and 'three-line' are used to facilitate distinction between the two types of CareStart™ Malaria tests evaluated. The result of the CareStart™ Malaria two-line pLDH (Pan) was recorded as negative, positive or invalid. The CareStart™ Malaria three-line pLDH (Pan, Pf) and OptiMAL-IT® test results were recorded as negative, Pf (+/- P. other), non-Pf or invalid. It should be noted that interpretation of the CareStart™ Malaria pLDH (Pan, Pf) test differs from OptiMAL-IT® in that a test with a positive control line, positive Pf pLDH band and negative pLDH (Pan) band should be interpreted as Pf. For OptiMAL-IT® both Pf and pan pLDH bands must be positive to interpret the test as Pf.

Blinding
Tests were labelled with the patient code on the underside. Two laboratory technicians performed the tests, took the capillary blood sample for haematocrit and the sample onto filter paper for PCR analysis, and prepared the malaria blood slide. The slide was passed to another laboratory technician to stain and read, who was unaware of the RDT results. One test was handed to each of three readers for interpretation at the appropriate time. The tests were then handed on to three different readers, who were unaware of the first interpretation, to read and interpret the tests 10 minutes later. For the duration of the study each of the six readers read the same type of test and made the same reading each time (i.e. first or second).

Visit schedule
All patients were seen on day 0 and the first 120 patients who agreed (60 with slide confirmed Pf mono-infection, 60 with Pv) were asked to return on days 2, 7, 14, 21 and 28 to document when the tests became negative. A malaria slide was performed simultaneously. Once the test was negative on any follow-up day they did not need to return for subsequent visits.

Laboratory procedures
RDTs were performed according to the manufacturers' instructions. For all tests, a test without a control line was considered invalid and repeated. The number of invalid tests was recorded. Blood films were stained with 10% Giemsa for 20 minutes and read by experienced technicians. Parasite stages were counted separately by species. Trophozoites were counted on the thick smear (if count <500/500 WBC) or on the thin smear. At least 200 high power fields were examined before a slide was declared negative.

PCR species sensitive detection
Sensitive species PCR genotyping on blood samples on filter paper was performed by the Shoklo Malaria Research Unit (SMRU) blind to the RDT and microscopy results. Parasite DNA was extracted from the bloodspot using the saponin lysis/chelex extraction method developed by Wooden and colleagues in 1993 [16]. DNA was stored at -20°C until processed. The targets of the Plasmodium-species PCR amplification are the genes coding for the small subunit ribosomal RNA (ssrRNA). In the first amplification reaction (Nest 1), genus-specific primers were used to amplify a fragment of the ssrRNA genes of any Plasmodium parasite. The product of the first reaction was then used as the DNA template for a second amplification reaction (Nest 2). Species-specific primer pairs in the Nest 2 round amplified the specific sequence for P. falciparum, P. vivax, P. malariae or P. ovale. Primers and amplification conditions were those described by Snounou et al. [17].

Treatment
Patients with vivax malaria were treated with chloroquine 25 mg/kg divided over three days (10+10+5). Patients with falciparum malaria or mixed falciparum/vivax infections with a falciparum parasitaemia of < 4% infected red blood cells (irbc) were treated with mefloquine (25 mg/kg) + artesunate (4 mg/kg/day for three days) (MAS3). If the falciparum parasitaemia was ≥ 4% irbc they were treated with a seven-day regimen of artesunate (4 mg/kg for one day followed by 2 mg/kg/day for six days) and mefloquine (25 mg/kg) (MAS7). Those patients asked to come back for repeat testing to evaluate when the tests became negative had all doses of treatment supervised.

Assessment of ease of use of tests
Ease of use of the tests was assessed using qualitative and quantitative criteria.
Qualitative: laboratory technicians performing the tests were asked to rank the tests in order of preference, where 1 corresponded to their most preferred test and 3 to their least preferred, in each of the following categories: ease of taking blood, ease of adding reagents, ease of interpretation and overall performance.
Quantitative: number of steps in the procedure, time to wait before reading test.

Quality assurance and quality control
The laboratory was already implementing internal quality control (QC) checks monthly and slides were sent periodically to an external laboratory (SMRU). In this study all slides were double-read, blind, by experienced technicians, with approximately seven days between the first and second readings. Slides with discordant results between two microscopists, defined as positive/negative discordance for asexual stages; species discordance for asexual stages; asexual parasite density discordance (difference in parasitaemia ≥ 50%) and gametocyte or malaria pigment reporting discordance were sent to SMRU for a third blind reading. The third reading was taken as the definitive result.
A sample of 120 of each of the CareStart™ Malaria tests (one box contains 60 tests) and 96 OptiMAL-IT® tests (one box contains 24 tests) were sent to the Malaria RDT Quality Assurance Laboratory, Research Institute for Tropical Medicine, Muntinlupa City, Manila, Philippines for heat stability testing and quality control of performance according to Standard Operating Procedures (SOPs) under development by the WHO, Research Institute for Tropical Medicine, and others as part of a joint initiative to develop laboratory methods for malaria RDT assessment. Stability of the test result at one week in the field was also evaluated with a third blind reading made one week after the first to see whether retrospective checking of RDTs could be introduced as part of a regular QC procedure.

Data management and analysis
Data from the case report forms were checked and entered on site in Microsoft® Access. The source data forms with the RDT results were entered separately into Microsoft® Excel. The two databases of RDT results were cross-checked. All discrepancies were corrected by returning to the source documents. Statistical analyses were performed using SPSS® version 14.0 (Chicago, USA). Sensitivity, specificity, positive and negative predictive values of each test were calculated using microscopy as the reference standard. Sensitivities of the RDTs for detection of Pf and Pv mono-infections were then calculated. The first reading made of the RDT was used for these calculations. Subgroup analyses were performed to calculate the sensitivity and specificity of the tests at a parasitaemia below 100/μL and in the under five years age group. Multivariate analysis (logistic regression) was used to explore the effect of selected covariates (age, haematocrit, parasitaemia) on test sensitivity and specificity. Sensitivity and specificity of the tests and of slide microscopy compared to species PCR genotyping were also evaluated in a subset of patients. The confidence intervals (Wilson method) were calculated using Confidence Interval Analysis (CIA) software, version 21.2 (© Trevor Bryant 2002-2004, University of Southampton, UK). Agreement between the first and second and first and third readings of each RDT was assessed using the kappa coefficient.
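As an illustration of the accuracy measures and Wilson confidence intervals described above, the following minimal sketch computes them from a 2x2 table of RDT result against microscopy. It is not the CIA or SPSS implementation used in the study; the function names and the counts in the usage example are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, total, confidence=0.95):
    """Wilson score interval for a binomial proportion."""
    if total <= 0:
        raise ValueError("total must be positive")
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return centre - half, centre + half


def diagnostic_accuracy(tp, fp, fn, tn):
    """Point estimates and Wilson intervals, with microscopy as reference.

    tp/fn: slide-positive patients with positive/negative RDT
    fp/tn: slide-negative patients with positive/negative RDT
    """
    return {
        "sensitivity": (tp / (tp + fn), wilson_interval(tp, tp + fn)),
        "specificity": (tn / (tn + fp), wilson_interval(tn, tn + fp)),
        "ppv": (tp / (tp + fp), wilson_interval(tp, tp + fp)),
        "npv": (tn / (tn + fn), wilson_interval(tn, tn + fn)),
    }


if __name__ == "__main__":
    # Hypothetical counts, NOT taken from the study tables.
    for name, (estimate, (low, high)) in diagnostic_accuracy(90, 20, 10, 180).items():
        print(f"{name}: {estimate:.1%} [CI95 {low:.1%}-{high:.1%}]")
```

The Wilson interval is asymmetric around the point estimate, which is why the intervals reported for sensitivities close to 100% extend further below the estimate than above it.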

Definitions
Slides for which only gametocytes were detected were considered in two different ways, firstly as positive for that species, since gametocytes are known to produce pLDH, and secondly as negative since missing a solitary gametocytaemia is not usually an indication for treatment. Slides reported as showing malaria pigment only by microscopy were considered as negative.

Results
Between 30th August and 9th November 2007, 1,004 patients were enrolled in the study out of a total of 1,833 patients attending the clinic with fever (Figure 1). The target sample size was exceeded in order to include 100 patients with falciparum malaria in the study. Reasons for non-inclusion of screened patients were not documented. Baseline characteristics of the included population are shown in Table 1. No serious adverse events were detected during the study. There were 13 protocol violations. One subject was less than two years of age, the lower limit for inclusion into the study. Eight patients with a parasitaemia of PFT ≥ 40/1,000 irbc were treated with MAS3 instead of MAS7. Four patients were recruited into the follow-up phase of the study on the basis of a positive malaria smear result when their RDT result was negative and had to be excluded from the analysis.

Malaria slide results
The results of the slide microscopy on the day of enrolment (adjusted after quality control) are shown in Table 2. The sensitivity, specificity, positive and negative predictive values of the RDTs compared to slide microscopy are shown in Tables 3 and 4.

Effect of parasitaemia and age on sensitivity and specificity of RDTs
Sensitivities of the RDTs for detection of malaria at higher and lower parasitaemia were compared using 100 parasites/μL as the cut-off. None of the patients with a presenting parasitaemia meeting the local definition of hyperparasitaemia (PFT ≥ 4% irbc) had a false negative RDT result. The number of patients under five years of age with positive slide results was small. Test sensitivities are shown in Table 6. In a multivariate analysis only the association between parasitaemia and RDT result was significant.

Inter-observer agreement
Results for inter-observer agreement between the first and second readings of the tests performed 10 minutes apart and the first and third readings of the tests, performed one week apart are shown in Table 7.
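For readers unfamiliar with the kappa coefficient used for these comparisons, the sketch below computes Cohen's kappa for two readings of the same set of tests. It assumes unweighted kappa on categorical results (for example negative, Pf, non-Pf, invalid); the paper does not specify the variant, and the function name and example readings are hypothetical.

```python
from collections import Counter


def cohens_kappa(reading_a, reading_b):
    """Unweighted Cohen's kappa for two paired lists of categorical results."""
    if len(reading_a) != len(reading_b) or not reading_a:
        raise ValueError("readings must be non-empty and of equal length")
    n = len(reading_a)
    # Observed agreement: proportion of tests given the same result twice.
    observed = sum(a == b for a, b in zip(reading_a, reading_b)) / n
    # Agreement expected by chance, from each reader's marginal frequencies.
    freq_a, freq_b = Counter(reading_a), Counter(reading_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both readers gave one identical category throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical readings, NOT study data.
    first = ["Pf", "non-Pf", "negative", "negative", "Pf", "negative"]
    second = ["Pf", "non-Pf", "negative", "non-Pf", "Pf", "negative"]
    print(f"kappa = {cohens_kappa(first, second):.2f}")
```

Kappa corrects raw percentage agreement for the agreement that would be expected by chance alone, so values above 0.9, as reported here, indicate that readers almost always reached the same interpretation.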

Results of ease of use evaluation
General remarks made were that the Pf band of the CareStart™ Malaria three-line pLDH (Pan, Pf) test could be very faint, making reading more difficult. For some tests the background was completely red, invalidating the test. It was important not to delay before adding the buffer since the tests could dry out quickly and give an invalid result. The information provided with the CareStart™ Malaria tests was unclear, the title of the packet inserts for both tests was the same and the two tests were in identical packaging, causing confusion. The pipettes for the CareStart™ Malaria tests were provided separately, unlike OptiMAL-IT® where each test is pre-packed with the lancet, pipettes, alcohol swab and instructions. The buffers for the two tests were unlabelled. These points gave rise to a perception that the tests were of inferior quality to the OptiMAL-IT®. When it was very windy there were more invalid OptiMAL-IT® tests because they tended to dry out between steps. The OptiMAL-IT® pipette was found to be more difficult to use, requiring twice as much blood as the CareStart™ Malaria tests. It was thought to be the most complicated test to perform, with 10 steps compared to only four for the CareStart™ Malaria tests. The CareStart™ tests were ranked joint first as the easiest to use.

Slide microscopy
From a total of 1,318 slides read, 228 (17%) were sent for a third blind reading. Of these 76 (5.7%) were sent because of differences in species between the first and second readers, 118 (9%) because of >50% variation in parasitaemia between first and second readings and the remainder because of discrepancies in recording of gametocytes or malaria pigment.

Rapid diagnostic test handling and quality control results
In July 2007, tests were sent from the manufacturers to the MSF logistic department in France by air and then onto Yangon. After one week in customs they were stored in a container at 20°C for two weeks. They were then transported to the central laboratory in Dawei, where the temperature varied between 12 and 24°C, and humidity from 62-81% (based on twice daily recordings). Tests were sent out to the field site in batches where the temperature varied between 23 and 31°C, humidity 50-90%.

Sensitive species PCR genotyping results
Sensitive species PCR genotyping was run on 662 enrolment and follow-up specimens. These included all slide or RDT positive samples and 20% of the negative samples selected at random. The results for the sensitivity and specificity of microscopy and the RDTs compared to PCR are shown in Tables 8 and 9. Table 10 provides an overall summary of the different attributes of the tests.

Discussion
In this evaluation the OptiMAL-IT® test was the most sensitive test for detection of malaria; however, this was at the ...
How solitary gametocytaemia was defined had a big impact on all the results, with an important decrease in sensitivity if a slide with Pf or Pv gametocytes, but no trophozoites, was counted as positive for Pf or Pv rather than negative.
All three tests were highly sensitive for the detection of falciparum malaria mono-infections. For a parasitaemia > 100/μL all tests were >95% sensitive for the detection of Pf, meeting the criteria set out by the WHO. This is extremely important from the point of view of patient safety. There were five missed falciparum infections, but these were all low parasitaemia infections, the highest being 36 Pf trophozoites/500 white blood cells on the thick smear.
Roll Back Malaria has only recently abandoned its recommendation that children under five years of age with fever ...
It is usually assumed that PCR is more sensitive and specific for diagnosis than microscopy. Expert microscopy vs PCR would be expected to have specificity close to 100% [17]. In this study, specificity for detection of Pv was lower than for Pf. Possible explanations are inaccurate microscopy (despite the QA/QC measures in place), sub-optimal sensitivity of the PCR method or, probably, a combination of the two. The advantage of using expert microscopy over PCR sensitive species genotyping is that it gives information about parasite stage and parasitaemia. PCR may be a good alternative in areas where expert microscopy cannot be put in place for a study or in travellers returning to non-endemic areas. In an endemic area PCR may detect an infection that has cleared below the limit of microscopic detection or has been treated. Surprisingly, only three subjects (0.3%) admitted to having taken antimalarials before presenting to the clinic.
A large number of patients presenting with fever or history of fever (64.7%) did not have malaria. This low prevalence was reflected in the relatively low PPV of the OptiMAL-IT® test of 69.5% [95%CI 62.4-75.8] for falciparum-containing infections. Malaria is not the only disease for which improved diagnostics are needed at field level. Leptospirosis and rickettsial diseases, such as scrub and murine typhus, are common in SE Asia. Describing the epidemiology of fever in this population would lead to improved case management. Among those who did have malaria, the vivax prevalence was more than double that of falciparum, a reversal of the ratio observed in 2006 in the same area. The explanation for this is unclear since ACT has been available to treat falciparum malaria in this area for some years. However, access may have improved in adjacent areas as a large donation of artemether-lumefantrine was deployed in government clinics across Myanmar in 2007 with a consequent impact on transmission of falciparum malaria.
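The link between prevalence and PPV noted above follows directly from Bayes' theorem. The sketch below uses the OptiMAL-IT® sensitivity and specificity for Pf +/- other species reported in the abstract (95.2% and 94.7%); the prevalence values are illustrative assumptions, not study estimates, and the function name is hypothetical.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """PPV from Bayes' theorem: true positives over all positive results."""
    true_positive = sensitivity * prevalence
    false_positive = (1 - specificity) * (1 - prevalence)
    return true_positive / (true_positive + false_positive)


if __name__ == "__main__":
    sens, spec = 0.952, 0.947  # OptiMAL-IT, Pf +/- other species (abstract)
    # Illustrative prevalences among screened fever cases (assumed values).
    for prevalence in (0.05, 0.10, 0.30, 0.50):
        ppv = positive_predictive_value(sens, spec, prevalence)
        print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}")
```

With sensitivity and specificity held fixed, PPV rises steeply as prevalence increases, which is consistent with the observation that this test would perform best where malaria prevalence among screened fever cases is high.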
The OptiMAL-IT® was highly sensitive for the detection of falciparum and vivax malaria and would perform best in an area of high malaria prevalence among screened fever cases. However, heat stability was not acceptable and the number of steps to perform this test is a significant drawback in the field.
The CareStart™ Malaria pLDH (Pan) test may be a good alternative to Paracheck-Pf™ in areas where the predominant species is P. falciparum, particularly if transmission is high since it becomes negative rapidly.
Heat stability remains a major concern for the pLDH tests, and stability testing at intervals as part of quality assurance and quality control of malaria diagnostic procedures is recommended in programmes using pLDH-based RDTs. Unless manufacturers provide convincing data on heat stability from different test batches, a rapid diagnostic test should not be recommended.
Choosing a rapid diagnostic test to deploy in the field depends on numerous factors. High sensitivity and specificity for detection of disease are the most important features of a good test; however these become less relevant if the test is not heat-stable in field conditions, if the test is too complicated to perform or if inter-observer agreement is poor. Cost is also a factor, but costs may vary greatly depending on the quantity ordered and a recent increase in the number of tests on the market should lead to prices coming down.
The drawbacks of the HRP-2 tests have been described; however, it could be argued that these are outweighed by their high sensitivity for detection of potentially life-threatening falciparum malaria and their superior heat stability.
The approach used here to select tests at random and evaluate them is expensive and labour intensive. The results of the first round of the WHO product testing have been published recently. In general, highest Pf detection rates were demonstrated by tests targeting HRP-2. Batch-to-batch variation was also observed, leading to the recommendation to test new lots post purchase and prior to use [18]. This initiative by the WHO to centralize RDT evaluations and perform in vitro QC and stability testing is wel-