Spatial and temporal patterns of malaria incidence in Mozambique
© Zacarias and Andersson; licensee BioMed Central Ltd. 2011
Received: 24 November 2010
Accepted: 13 July 2011
Published: 13 July 2011
The objective of this study is to analyze the spatial and temporal patterns of malaria incidence in order to determine how climatic factors such as temperature, rainfall and humidity affect its distribution in Maputo province, Mozambique.
This study presents a model of malaria that evolves in space and time in Maputo province, Mozambique, over a ten-year period (1999-2008). The model incorporates malaria cases and their relation to environmental variables. Because the climatic data are incomplete, a multiple imputation technique is employed. Additionally, the environmental variables are interpolated over the whole province through a Gaussian process. This method overcomes the misalignment between environmental variables (available at meteorological stations, i.e. points) and malaria cases (available as aggregates for every district, i.e. areas). Markov Chain Monte Carlo (MCMC) methods are used to obtain posterior inference, and the Deviance Information Criterion (DIC) is used to perform model comparison.
A Bayesian model with interaction terms was found to be the best-fitting model. Malaria incidence was associated with humidity and maximum temperature. Malaria risk increased with maximum temperature over 28°C (relative risk (RR) of 0.0060 and 95% Bayesian credible interval (CI) of 0.00033-0.0095) and with humidity (RR of 0.00741 and 95% Bayesian CI of 0.005141-0.0093). The results suggest that additional non-climatic factors, including socio-economic status and elevation, also influence malaria transmission in Mozambique.
These results demonstrate the potential of climate predictors, particularly humidity and maximum temperature, in explaining malaria incidence risk for the studied period in Maputo province. Smoothed maps obtained as monthly averages of malaria incidence made it possible to visualize the months of initial and peak transmission. They also illustrate a variation in malaria incidence risk that might not be related to climatic factors. However, these factors remain determinants of malaria transmission and intensity in the region.
Malaria is considered one of the most deadly diseases in Mozambique, with around six million cases reported each year. Most of these cases are caused by Plasmodium falciparum [1, 2]. Transmission takes place all year round, with a seasonal peak extending from December to April. Many factors affect the dynamics of malaria transmission and infection, ranging from social to natural. Rainfall and temperature can be considered the major natural risk factors, affecting the mosquito life cycle and breeding. Relative humidity plays a role in the lifespan of the mosquito: when relative humidity is high, the mosquito survives long enough for the parasite to complete its life cycle, which increases transmission of the infection to more humans. All districts in Maputo province show favourable climatic conditions for the development and transmission of malaria. Studies on the prevalence of malaria are important not only to assess the problem of malaria in a given region, but also to analyse the effectiveness, quality and impact of strategies for primary and secondary prevention.
A combination of advances in hierarchical modelling and geographical information systems has led to developments in the fields of geographical epidemiology and public health surveillance. This has made it possible to explore and characterize different sets of spatial disease patterns at a very fine geographical resolution. As a result, disease mapping has been widely used in epidemiology and public health research. Geographical mapping helps to detect areas with high disease incidence, whose neighbouring areas usually show similar risk factors. One common application of disease mapping has been in describing the variation in health outcomes over geographic regions. However, mapping crude disease rates can be quite misleading, particularly at a small area level. This is often due to the combination of two factors: small regional incidence counts and the presence of spatial correlation in the rates. Low prevalence diseases do not allow stable estimates to be obtained at the district level. For high prevalence diseases like malaria, however, such estimates are easily attained because a large amount of information is available at the district level.
Different approaches have been used to model spatio-temporal problems, starting from work in which space-time interaction is realized by assuming an area-specific linear time trend for relative risks. Many other researchers [7, 8] proposed and implemented space-time models with different interactions. Spatial and temporal variation in malaria has also been studied together with an investigation of the possible geographical expansion of malaria transmission. Space-time models using malaria data are investigated in [10, 11], which use dynamic and Bayesian models respectively. Climatic variables are then used as covariate predictors of malaria incidence risk.
This study also faces the problem of incomplete data, i.e. missing values in some of the explanatory variables. A multiple imputation approach to the missing data is pursued.
Gelfand et al. define the change of support problem (COSP) as inference about the values of a variable at a level of spatial aggregation different from that at which it has been observed. In this study, the COSP is addressed by interpolating these factors (covariates) through a Gaussian process.
To provide a spatio-temporal analysis of malaria incidence risk;
To determine the contribution of predictors/covariates to the variation in malaria incidence risk.
The environmental data used in this study are collected at monitoring stations located in five of the eight districts in Maputo province. This is a typical change of support (misalignment) problem, with environmental factors observed at fixed locations s (point-referenced data), whereas the malaria cases are observed at district level. Zhu and colleagues propose different approaches to tackle the misalignment problem, namely predictions from points to points, points to blocks and blocks to blocks. Their work is supported by an application to a static spatial case using a dataset of point-level ozone measurements in the Atlanta metropolitan area. The same researchers investigated further by looking at the relationship between ambient ozone and paediatric emergency room visits. The application is extended to a spatio-temporal model with log-ozone modelled by a stationary Gaussian process. Before attempting to model misaligned data, it was necessary to address the problem of missing data.
In data analysis, it is commonplace to find that the data for each case are not always complete; some data are usually missing. The amount of missing data may be minimal for some cases and significant for others. Problems in the analysis of missing data have been extensively reviewed in the scientific literature [14–16]. Environmental time series generally focus on physical and chemical measurements. Hence, problems such as confidential information, respect for privacy and non-response, which arise for example in social and medical surveys, are not present. The main sources of missing records considered in this study are:
Break-down of measurement instruments;
Maintenance interventions, and
Overall estimated mean and variance of imputed data.
When the space-time model was run with a single imputed dataset, the results for the main parameters were similar to those obtained with the data averaged over the five imputed datasets. This indicates that the MI procedure produced consistent results.
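The comparison above relies on averaging over five imputed datasets. One standard way to combine per-dataset estimates is Rubin's rules; the source does not state that these rules were used, so the following is only a minimal sketch under that assumption. The function name and the numbers are hypothetical.

```python
import statistics

def pool_rubin(estimates, variances):
    """Pool m per-imputation estimates with Rubin's rules (assumed, not from the source).

    estimates: point estimate from each imputed dataset
    variances: squared standard error from each imputed dataset
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    pooled = statistics.fmean(estimates)          # average of the m estimates
    within = statistics.fmean(variances)          # average within-imputation variance
    between = statistics.variance(estimates)      # between-imputation variance
    total = within + (1.0 + 1.0 / m) * between    # Rubin's total variance
    return pooled, total

# Hypothetical estimates from five imputed datasets.
pooled_estimate, total_variance = pool_rubin(
    [1.0, 2.0, 3.0, 4.0, 5.0], [1.0, 1.0, 1.0, 1.0, 1.0]
)
```

The between-imputation term is what a single imputed dataset cannot provide, which is why agreement between the single-dataset run and the averaged run is a useful consistency check rather than a full uncertainty assessment.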
The change of support problem of interest involves predicting rainfall, temperature (minimum, mean and maximum) and relative humidity at different points on the map and, from that, moving to the prediction of average weather parameters at district level. Although at least twenty-six weather stations are registered in Maputo province, on average only five or six stations are operational at any time. The sites are monitored daily within the same time interval and the data are aggregated for every four weeks (month). Hence, the data comprise a time series of spatial processes on an equally spaced time scale, and it is reasonable to consider the change of support problem only in space.
A spatial covariance structure with no nugget effect is used, with Σ specified as Σ = σ²H(ϕ), where (H(ϕ))_ij = ρ(ϕ; d_ij). Here d_ij = ||s_i - s_j|| is the distance between points s_i and s_j, and ρ is an ordinary exponential function. To complete the model specification, independent priors are assigned to the parameters: a multivariate normal for β, an inverse gamma for σ² with parameters a = 0.004 and b = 0.02, and a gamma prior for ϕ with mean 0.12 and variance about 0.05. The value of each environmental factor predicted for district i was estimated as the mean of the posterior predictive distribution of the random field process at the grid points falling within that district. This procedure was repeated independently for every year r = 1999, ..., 2008.
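As an illustration of the covariance structure described above, the following sketch builds Σ = σ²H(ϕ) for a handful of stations. It assumes the exponential correlation takes the form ρ(ϕ; d) = exp(-ϕd), which is one common parameterization and may differ from the one used in the study; the station coordinates and parameter values are hypothetical.

```python
import math

def exponential_covariance(points, sigma2, phi):
    """Return Sigma = sigma2 * H(phi), with H_ij = exp(-phi * d_ij) (assumed form)."""
    n = len(points)
    cov = [[0.0] * n for _ in range(n)]
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            d_ij = math.hypot(xi - xj, yi - yj)      # Euclidean distance ||s_i - s_j||
            cov[i][j] = sigma2 * math.exp(-phi * d_ij)
    return cov

# Hypothetical station coordinates (arbitrary units) and parameter values.
stations = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
sigma = exponential_covariance(stations, sigma2=2.0, phi=0.12)
```

With no nugget effect, the diagonal equals σ² exactly, and the covariance decays smoothly towards zero as stations move further apart.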
Results of bi-variate analysis of prediction variables.
Counts of malaria cases are registered daily at health centres and rural hospitals, generating the Weekly Epidemiological Bulletin. They are collated and summarized by each district government health department and reported monthly to the provincial health officers. These summaries are sent to the Ministry of Health and shared with different disease control programmes. Expected cases are taken to be the population of each district in the corresponding year.
where α is a measure of overall incidence (intercept term), θ_i is the spatial random effect and φ_rt is the monthly temporal random effect for each year. δ_rit is the space-time interaction term, and β and X_rit are the vectors of regression coefficients and environmental covariates respectively. Spatial dependence is introduced through a conditional autoregressive (CAR) process.
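The terms defined above are consistent with a log-linear model of the following general form. This is a sketch assembled from those definitions only: the Poisson likelihood, the use of the expected cases E_ri as an offset and the exact arrangement of terms are assumptions, not the authors' typeset equation. The indices are year r, district i and month t.

```latex
\begin{aligned}
Y_{rit} &\sim \mathrm{Poisson}(\mu_{rit}), \\
\log \mu_{rit} &= \log E_{ri} + \alpha + \theta_i + \varphi_{rt} + \delta_{rit}
                 + \boldsymbol{\beta}^{\top} X_{rit}.
\end{aligned}
```

Under this form, exp(α + θ_i + φ_rt + δ_rit + βᵀX_rit) would be read as the risk in district i, month t of year r relative to the expected count.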
In the CAR model, the conditional distribution of each θ_i given all other θ's is a normal distribution with mean equal to the average of the θ's of its neighbours and precision proportional to the number of neighbours. Hence, a neighbourhood structure needs to be defined and supplied to the model through the matrix W. This matrix is important as it specifies how much influence neighbouring districts have on district i. Figure 1 illustrates the location of each district; the shapes and the lengths of the boundaries vary considerably among districts.
Binary structure, with ω_ij = 1 for neighbouring districts i and j, and ω_ij = 0 otherwise;
Weighted by the length of the boundary, i.e. with ω_ij equal to the border length (in km) between districts i and j, and ω_ij = 0 for districts not sharing a common boundary. In this case the effect of a neighbouring district varies according to the length of the shared boundary.
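The two neighbourhood structures above can be compared through the conditional moments they imply for θ_i. The sketch below assumes an intrinsic CAR prior in which the conditional mean is the weighted average of the neighbouring θ's and the conditional precision is a precision parameter multiplied by the sum of the weights; the three-district layout, border lengths and parameter names are hypothetical.

```python
def car_conditional(theta, weights, i, tau=1.0):
    """Conditional mean and precision of theta[i] given all other theta's.

    weights: neighbourhood matrix W (zero for non-neighbours)
    tau: hypothetical precision parameter of the CAR prior
    """
    row = weights[i]
    total = sum(w for j, w in enumerate(row) if j != i)
    if total == 0:
        raise ValueError("district has no neighbours")
    mean = sum(w * theta[j] for j, w in enumerate(row) if j != i) / total
    precision = tau * total
    return mean, precision

theta = [0.0, 1.0, 3.0]                      # hypothetical spatial effects

# District 0 borders districts 1 and 2; districts 1 and 2 do not touch.
w_binary = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
w_border = [[0, 10, 30], [10, 0, 0], [30, 0, 0]]   # border lengths in km (made up)

binary_moments = car_conditional(theta, w_binary, 0)
border_moments = car_conditional(theta, w_border, 0)
```

With binary weights both neighbours count equally, whereas with border-length weights the district sharing the longer boundary pulls the conditional mean towards its own value.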
and the weight matrix Q defines the temporal neighbours of month t as months t-1 and t+1 for t = 2, ..., 11, with months t = 1 and t = 12 each having a single neighbour.
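The temporal neighbourhood just described can be written down directly. The sketch below builds Q as a binary 12 x 12 matrix; the function name is hypothetical, and months are indexed from 0 in the code (month t in the text corresponds to index t-1).

```python
def temporal_neighbours(n_months=12):
    """Binary matrix Q with q[s][t] = 1 when months s and t are adjacent."""
    q = [[0] * n_months for _ in range(n_months)]
    for t in range(n_months - 1):
        q[t][t + 1] = 1      # month t is a neighbour of month t+1
        q[t + 1][t] = 1      # and vice versa
    return q

Q = temporal_neighbours()
neighbour_counts = [sum(row) for row in Q]
```

The row sums give the number of temporal neighbours of each month: one for the first and last month, two for all others.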
The space-time interaction terms δ_rit capture departures from the space and time main effects, which may highlight space-time clusters of malaria risk. In the present study they are assumed to be independent for every year and month with a constant variance over time. This is captured by an autoregressive AR(1) prior process. It is parameterized by a temporal variance that allows for correlation between consecutive months within the same district, i.e. it assumes that cases in month t are influenced by cases in month t-1. This relationship holds for months in the same year and also for the December-January transition between consecutive years, except for January of the first year and December of the last year. A uniform prior was specified for the intercept term and a normal prior with high variance for the coefficients β. The variance parameters of the spatial and temporal random effects were assigned inverse gamma hyper-prior distributions.
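To make the month-to-month dependence concrete, the sketch below flattens (year, month) into one running index, so that December of one year is immediately followed by January of the next, and then generates an AR(1) sequence for a single district. The autocorrelation parameter rho and the innovation values are hypothetical; the source does not report how the AR(1) prior was parameterized beyond its variance.

```python
def month_index(year, month, first_year=1999):
    """Running month index: January 1999 -> 0, ..., December 2008 -> 119."""
    return 12 * (year - first_year) + (month - 1)

def ar1_path(rho, innovations, start=0.0):
    """Generate delta_t = rho * delta_{t-1} + eps_t for one district."""
    path = []
    previous = start
    for eps in innovations:
        current = rho * previous + eps   # each month depends on the previous one
        path.append(current)
        previous = current
    return path

# A single unit shock followed by no further innovations decays geometrically.
example_path = ar1_path(0.5, [1.0, 0.0, 0.0])
```

Because the index runs continuously over the ten years, only the first January has no predecessor and only the last December has no successor, matching the exception noted in the text.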
Model fitting used Markov Chain Monte Carlo (MCMC) simulation techniques implemented in WinBUGS, using two parallel chains. A burn-in of 30,000 iterations was allowed, during which the values of the main parameters were stored. Diagnostic tests for convergence of the stored variables were undertaken, including analysis of the Brooks, Gelman and Rubin statistics, visual examination of history and density plots, and computation of Monte Carlo errors (MCE). The ratio MCE/SD was less than 0.05, so it was concluded that sufficient iterations had been conducted. This was followed by a further 30,000 iterations to obtain the posterior distributions of the parameters.
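The two numerical checks mentioned above can be illustrated outside WinBUGS. The sketch below computes the classic variance-based Gelman-Rubin statistic and a batch-means estimate of the Monte Carlo error; WinBUGS uses its own variants of both, so this is an approximation of the idea rather than a reproduction of its output. The chains are simulated independent draws standing in for real sampler output.

```python
import math
import random
import statistics

def gelman_rubin(chains):
    """Variance-based potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance([statistics.fmean(c) for c in chains])
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)

def mce_over_sd(chain, n_batches=20):
    """Batch-means Monte Carlo error divided by the posterior standard deviation."""
    size = len(chain) // n_batches
    batch_means = [
        statistics.fmean(chain[k * size:(k + 1) * size]) for k in range(n_batches)
    ]
    mce = statistics.stdev(batch_means) / math.sqrt(n_batches)
    return mce / statistics.stdev(chain)

# Two simulated chains of independent standard normal draws (stand-in data).
rng = random.Random(0)
chains = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(2)]
r_hat = gelman_rubin(chains)
ratio = mce_over_sd(chains[0])
```

A Gelman-Rubin value close to 1 and an MCE/SD ratio below 0.05 would correspond to the criteria quoted in the text; strongly autocorrelated chains need more iterations to reach the same ratio.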
Comparison of fitted models: (A) full model; (B) full model; (C) non-spatial model.
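Model comparison by DIC can be sketched as follows, using the usual definition DIC = D̄ + pD, where D̄ is the posterior mean deviance and pD = D̄ - D(θ̄) is the effective number of parameters. The deviance values below are invented for illustration and are not the values from the study's comparison table.

```python
import statistics

def dic_summary(deviance_samples, deviance_at_posterior_mean):
    """Return posterior mean deviance, effective parameters and DIC."""
    d_bar = statistics.fmean(deviance_samples)
    p_d = d_bar - deviance_at_posterior_mean    # effective number of parameters
    return {"Dbar": d_bar, "pD": p_d, "DIC": d_bar + p_d}

# Hypothetical deviance draws for two competing models.
model_a = dic_summary([100.0, 102.0, 104.0], 99.0)
model_c = dic_summary([110.0, 112.0, 114.0], 110.0)

# The model with the smaller DIC is preferred.
best = min([("A", model_a), ("C", model_c)], key=lambda item: item[1]["DIC"])[0]
```

Since DIC penalizes complexity through pD, a model with more random effects is preferred only if its improvement in fit outweighs the added effective parameters.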
Posterior estimates, including 95% credible intervals, of the intercept α, the environmental regression coefficients β, and the spatial and spatio-temporal variances, obtained by fitting model B (Table 3). Rows: intercept (α); minimum temperature (°C); maximum temperature (°C); relative humidity (%); spatial variation; spatio-temporal variation.
The effect of humidity and maximum temperature on malaria incidence risk is very high, as illustrated by their mean posterior predictive P-values in Table 4. The mean P-value decreases below the critical values as the maximum temperature increases. For the humidity predictor, the mean P-value shows a similar pattern. The values estimated by the posterior Bayesian model show a marked variation compared to the Standard Mortality Ratio (SMR), as shown on the maps (see Additional files 1 and 2). In general, regions with higher SMR do not coincide with regions of higher estimated malaria risk. For the months of May to July, malaria incidence is very stable. Incidence in the districts of Matutuine and Namaacha is generally lower than in other districts, except in August, when it increases in the district of Namaacha. For Namaacha, the lower incidence could be due in particular to its location at a higher altitude than the other districts. Most of its administrative posts lie 400 meters above sea level, while the other districts lie at around 50 meters and are surrounded by several water basins. The model using border lengths for the weighting matrix did not improve model performance, as shown by the results of the DIC analysis in Table 3. It outperformed only the model with no spatial dependency, i.e. the non-spatial model.
This research has analysed malaria case data from a spatio-temporal perspective to identify significant predictors associated with malaria incidence risk and to produce contemporary smoothed maps of disease risk in Maputo province. Maps of smoothed space-time malaria incidence have been produced in several studies [9–11]. Besides applying multiple imputation and spatial alignment techniques to a typical problem in the analysis of malaria incidence in Mozambique, this study implements Bayesian models that include temporal random effects and space-time interaction terms.
Missing data is a major issue in the analytical process of any study. It is normally addressed by applying imputation techniques. These fall into two categories: single imputation and multiple imputation (MI). The first has been subjected to increasing criticism by researchers due to its tendency to introduce bias and underestimate standard errors. However, if the quantity of missing values is very small (less than 5%), this methodology can in general be considered accurate. Multiple imputation is a more general method for inference with missing data. It replaces each missing record with multiple plausible values instead of a single replacement value.
Missing Completely at Random (MCAR): for the purposes of analysis, there is no difference between cases with missing data and cases without.
Missing at Random (MAR): the missingness is fully described by variables observed in the dataset.
Missing Not at Random (MNAR): the data are missing in a way that depends on unmeasured quantities, termed "non-ignorable".
Establishing the main sources of missing records helped to identify MAR as the most appropriate missing data mechanism underlying the incompleteness of the environmental data in this study.
The spatial pattern of malaria showed a more pronounced pattern of incidence in the north of Maputo province. In contrast, sub-regions in the centre and south exhibit relatively lower levels of incidence. The main hypothesis for these results is the occurrence of other factors such as indoor spraying and proximity to water basins. However, the absence of this information prevented the inclusion of these variables in the analysis. Furthermore, as malaria has a certain period of latency, the ideal would be to include information about, for example, past rather than present indoor spraying. On the other hand, no single temporal gradient of malaria relative risk is observed, with some areas showing a decrease and others a relative increase. Furthermore, quantifying the relative amount of the spatial risk pattern has helped to highlight districts with low and high proportions of malaria incidence in a given time period. In addition, this study may also contribute to the evidence of the importance of spatial and temporal smoothing of random effects in mapping malaria [9, 11, 21–23].
This study showed that the combination of monthly maximum temperature in the range 28 to 35°C and relative humidity in the range 54.5% to 83% provided suitable conditions for malaria transmission. The negative association between maximum temperature in the range 24 to 27°C and malaria incidence could indicate that warmer temperatures are needed for malaria transmission. The performance of rainfall in the analysis could be influenced by the presence of the humidity covariate. High levels of humidity are generally observed when temperature and rainfall are also high, leading to suitable conditions for parasite development due to available breeding sites and survival of the mosquito population.
Mapping the averaged smoothed malaria incidence risk for each month over the ten-year period provides a visual display of the months of initial and peak transmission (see Additional files 3 and 4). This may provide information on the length of transmission based on the predicted relationship with the included covariates. Although this study does not present a seasonal analysis of malaria incidence variation, the monthly variation illustrates some seasonal pattern in the months May-July (usually considered part of the winter period in Mozambique), where warmer temperatures may have reduced the die-back of mosquito and parasite levels, substantially increasing their availability in the following months. Nevertheless, climate remains the main limiting factor of malaria intensity, controlling transmission in both the spatial and temporal dimensions [25, 26].
In conclusion, the models applied in this study adjusted for unobserved spatial and temporal variation in risk factors, while allowing inter-monthly and inter-annual variation in malaria incidence to be influenced by environmental conditions. Nevertheless, the variation in malaria incidence risk could also be affected by other factors not considered in the analysis. These results may be useful for developing climate-based malaria surveillance systems in Mozambique, which can help to improve the management and implementation of nation-wide malaria control programmes by guiding public and private policies towards reducing malaria incidence in Maputo province. Variation from normal monthly minimum temperature and rainfall patterns showed limited use for predicting malaria incidence in Maputo province in this study.
Markov Chain Monte Carlo
Deviance Information Criteria
National Malaria Control Program
Mozambique National Meteorology Institute
Universidade Eduardo Mondlane
Swedish International Development Cooperation Agency
Weekly Epidemiological Bulletin
First Autoregressive Process
Thanks to SIDA and UEM - Project for Global Research in Mathematics, Statistics and Informatics for supporting this research.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.