Lineage divergence detected in the malaria vector Anopheles marajoara (Diptera: Culicidae) in Amazonian Brazil
© McKeon et al. 2010
Received: 15 April 2010
Accepted: 7 October 2010
Published: 7 October 2010
Skip to main content
© McKeon et al. 2010
Received: 15 April 2010
Accepted: 7 October 2010
Published: 7 October 2010
Cryptic species complexes are common among anophelines. Previous phylogenetic analysis based on the complete mtDNA COI gene sequences detected paraphyly in the Neotropical malaria vector Anopheles marajoara. The "Folmer region" detects a single taxon using a 3% divergence threshold.
To test the paraphyletic hypothesis and examine the utility of the Folmer region, genealogical trees based on a concatenated (white + 3' COI sequences) dataset and pairwise differentiation of COI fragments were examined. The population structure and demographic history were based on partial COI sequences for 294 individuals from 14 localities in Amazonian Brazil. 109 individuals from 12 localities were sequenced for the nDNA white gene, and 57 individuals from 11 localities were sequenced for the ribosomal DNA (rDNA) internal transcribed spacer 2 (ITS2).
Distinct A. marajoara lineages were detected by combined genealogical analysis and were also supported among COI haplotypes using a median joining network and AMOVA, with time since divergence during the Pleistocene (<100,000 ya). COI sequences at the 3' end were more variable, demonstrating significant pairwise differentiation (3.82%) compared to the more moderate 2.92% detected by the Folmer region. Lineage 1 was present in all localities, whereas lineage 2 was restricted mainly to the west. Mismatch distributions for both lineages were bimodal, likely due to multiple colonization events and spatial expansion (~798 - 81,045 ya). There appears to be gene flow within, not between lineages, and a partial barrier was detected near Rio Jari in Amapá state, separating western and eastern populations. In contrast, both nDNA data sets (white gene sequences with or without the retention of the 4th intron, and ITS2 sequences and length) detected a single A. marajoara lineage.
Strong support for combined data with significant differentiation detected in the COI and absent in the nDNA suggest that the divergence is recent, and detectable only by the faster evolving mtDNA. A within subgenus threshold of >2% may be more appropriate among sister taxa in cryptic anopheline complexes than the standard 3%. Differences in demographic history and climatic changes may have contributed to mtDNA lineage divergence in A. marajoara.
About 1.8 million species are known on Earth, including more than 1 million insects, 250,000 higher plants and 69,000 fungi . The Amazon comprises much of this biodiversity and is considered the largest gene reserve in the world, with an estimated 14% of all plant and animal species within its boundaries . Because speciation is not always accompanied by morphological change , the true number of biological species is likely to be greater than current estimates . Genetic analysis plays an increasingly important role in identifying changes in population structure and elucidating taxonomic status and phylogenetic relationships.
Genetically divergent but morphologically cryptic species have been described in many aquatic organisms , birds [6, 7] and insects [8, 9], particularly among mosquitoes in the Dipteran genus Anopheles [10–12]. The Neotropical Albitarsis Complex contains at least six species, only some of which are documented malaria vectors: Anopheles marajoara, a local and regionally important vector in lowland rainforest [13–15], Anopheles janconnae (previously Anopheles albitarsis E) implicated in local transmission in Amazonian savannah ; and Anopheles deaneorum, which appears to consist of multiple species , and is a potential vector, as demonstrated by comparative susceptibility laboratory studies [18, 19].
Various methods have been used to investigate species delimitation and identifications in the Albitarsis Complex [11, 20, 21], culminating, most recently, in the recognition of six species , but see Bourke et al . Genealogical analyses of complete mtDNA COI (Cytochrome oxidase I) sequences found that A. marajoara is paraphyletic, and may consist of at least two phylogenetic species  or lineages [3, 23], one of which may be A. janconnae.
The mitochondrial genome has been used extensively in studies of molecular evolution  and the COI gene has resolved evolutionary relationships among closely related species for a wide range of taxa [25, 26], including insects [27, 28] and cryptic species complexes [29, 30]. The Folmer region, 648-bp at the 5' end of the COI mitochondrial gene, has emerged as the standard barcode region [31, 32]. Interspecific divergence within insects almost always exceeds 3% and this value has been used as a speciation threshold . The true test of DNA barcode precision would include comparisons with sister species . The utility of DNA barcoding among insects is still being debated, because of success in revealing cryptic species [31, 35, 36] on one hand, and the inability to reliably detect species boundaries on the other [36–38]. An estimated 20% failure rate has been noted at the species level due to non-monophyly , which increases among insects due to overlapping ranges of intra- and interspecific sequence divergences . Together these findings suggest that the threshold level could be set lower than 3% to minimize false negatives .
Accurate morphological identification of the adult females of species within the Albitarsis Complex is virtually impossible, and the ITS2 has become a recognized molecular tool for identification [21, 41, 42]. ITS2 variation is low within a species due to homogenization and fixation while the overall fragment length is generally variable between species [20, 43–45]. As such it can usually resolve phylogenetic relationships at different taxonomic levels, and detect recently diverged taxa such as sibling species of mosquitoes .
Members of a species are rarely distributed homogeneously in space, and population subdivision can occur in response to geographical boundaries, social behaviour and genetic variation . Patterns in biological sequence data that arise from ancestry can be useful in determining the structure and boundaries of a given species . The objective of this study was to test the hypothesis of paraphyly  using a combined data set (white gene + COI) and to evaluate the population structure of A. marajoara to address the following questions: 1) is the proposed paraphyletic status supported; 2) what is the level of genetic differentiation between populations; 3) can lineages of A. marajoara be distinguished by the Barcode of Life (BOLD) 3% species threshold; and 4) can genetic differentiation be explained by demographic phenomena, geographic boundaries and (or) natural selection.
Anopheles albitarsis s.l. collection information
Serra do Navio
A 1200-bp fragment of the COI gene was amplified using the forward primer UEA3 and the reverse primer UEA10 . An 822-bp fragment from the 3' end using PCR primers: 2195D (5'-TGATTYTTTGGTCATCCNGAAGT-3'; a modification of C1-J-2195 for amplification in Collembola ) and FLY10A (5'-AATGCACTAATCTGCCATATTAG-3'; a modification of TL2-N-3014 for amplification in Simuliid flies ) was previously amplified for the northeastern collection samples . Individual PCR reactions were preformed using Ready-To-Go-PCR bead (Amersham Pharmacia/Biotech, NJ, USA) and run on a PTC 100 or 200 series thermal cycler (Biorad, Inc.), or a PTC-100 (MJ Research, Inc.), using the conditions stipulated in Mirabello and Conn . The Applied Genomics Technology Core (Wadsworth Center) carried out the sequencing. The forward and reverse COI sequences were aligned using Sequencher 3.0 (Gene Codes Corps, MI, USA), grouped together by sight and trimmed in PAUP*, version 4.0 . The complete overlap of both COI primer sets created a 488-bp fragment (Additional file 1).
An 800-bp fragment of the white gene was amplified using the W2R and WF primers, with the PCR conditions as reported in Mirabello and Conn . A 500-bp fragment of the ribosomal ITS2 was amplified using the primers 18S and 28S with the parameters in Li and Wilkerson . The PCR products were cleaned, sequenced and aligned creating a 496-bp fragment of the white gene and 361-bp of the ITS2 with complete forward and reverse overlap for each marker. Unique sequences for all markers are available in GenBank [GenBank: HQ026025-HQ026113].
All A. albitarsis s.l. sequences were identified to species by multiple gene (COI, white, ITS2) comparisons to those deposited in GenBank: A. albitarsis [DQ076207/DQ076208; AY956299/AY956300; AF462386/AF462387], A. oryzalimnetes, formerly A. albitarsis B [DQ076210-DQ076215; AY956297/AY956298; U92333], A. marajoara [DQ076216/DQ076217-DQ076221/DQ076225; AY956295/AY956296; U92334], A. deaneorum [DQ076226/DQ076229; AY956301/AY956302; AF461751/AF461752], A. janconnae, [DQ076231/DQ076232] and A. albitarsis F [DQ228314/DQ228315].
Genealogical trees were estimated using the concatenated (white + COI) data set. ITS2 sequences were excluded from these analyses because of relatively limited sample size. Maximum Parsimony trees were generated in PAUP*, with one hundred replicates of a heuristic search performed with an initial random stepwise addition of sequences and TBR branch swapping. Branch support was estimated from 1,000 replicates of a bootstrap search. Bayesian inference (BI) analysis was performed with Mr. Bayes version 3.1 [56, 57], partitioned by gene, using the model of nucleotide substitution (TPM3uf+G and TIM1+I) that best fit the white gene and COI respectively, determined with jModelTest [58, 59]. The settings were two simultaneous, independent runs of the Markov Chain Monte Carlo (MCMC) for 4 million generations, sampling every 1,000 generations with a 'burnin' of 25%. The outgroup A. albimanus in the Albimanus Section of Nyssorhynchus (unpublished sequence) was chosen based upon its phylogenetic position in Sallum et al . Estimates of time to coalescence were calculated for the COI fragment only and compared using θ S values  and BEAST .
Genetic structure of two lineages was examined by analysis of molecular variance (AMOVA), a method of estimating population variance directly from molecular data, using Arlequin version 3.1.1 . In addition, spatial analysis of molecular variance (SAMOVA), version 1.0  was used to cluster the 488-bp COI sequence data into genetically and geographically homogeneous populations. SAMOVA generates F statistics (F SC , F ST , F CT ) using the AMOVA approach, into K groups to maximize the between group variation. SAMOVA estimates were computed for K = 2-13 with 1,000 simulated annealing steps from each of 100 sets of initial starting conditions.
A comparison of the pairwise divergences calculated by DnaSP between COI lineages using the 648-bp 5' end and the 488-bp 3' end tested the utility of the Folmer region and the 3% species threshold using 10 to 13 individuals from each lineage, three closely related taxa and an outgroup (Anopheles darlingi). Anopheles darlingi is more appropriate than A. albimanus for sister taxa comparisons as both it, and A. marajoara, are in the Argyritarsis Section . A species screening threshold (SST)  may be more appropriate and can be calculated as 10× the mean intraspecific variation for a group . The standard sequence threshold was calculated and examined for Nyssorhynchus using 4-58 archived GenBank sequences for known species.
A statistical parsimony network estimated genealogical relationships among COI haplotypes with a 93% sequence identity and 95% identity for each of the nuclear markers using TCS 1.13 . Homoplasy in all networks was resolved using the algorithm estimation rules in Crandall and Templeton .
The differentiation and polymorphism statistics for COI sequences by species or lineage and locality were computed in DnaSP, Version 4.0 , and the hypothesis of strict neutrality was examined using the statistics D T , D and F , and R 2  which are based on the frequencies of segregating sites, and Fu's F S , based on the haplotype distribution. Tajima's D and both Fu and Li's D* and F* are the most effective tests to detect background selection, whereas Fu's F S and R 2 are among the most powerful tests to detect population expansion . All neutrality tests were calculated using DnaSP, Version 4.0 or MEGA version 3.1 .
The mismatch distribution (simulated in Arlequin) is a frequency distribution of the observed number of pairwise nucleotide differences. The shape of the distribution is highly informative and able to differentiate between a population expansion and equilibrium , whereas the smoothness (raggedness statistic) distinguishes the fit of the empirical data to the model . Statistically significant differences between observed and simulated distributions were evaluated with the sum of square deviations (SSD) to reject the hypothesis of demographic expansion .
The estimated time to coalescence for the two COI lineages was obtained using the equations, θ s = 2Neμ,  and 4N e = years since coalescence . Based upon the 488-bp fragment, θ s is 7.530 (SD = 1.857). Therefore, the estimate of the time to coalescence is 232,476 - 384,716 years ago, in the Pleistocene. A similar estimate using BEAST, the standard arthropod mtDNA mutation rate of 2.3% per million years , an applied HKY model with gamma distribution assuming constant size and a relaxed molecular clock, yielded an estimate of 3.07 mya. The discrepancy between coalescent estimates is likely due to the 10 generations per year that is factored into the Walton equation . Although the incorporation of fossil calibration points can generate more reliable divergence estimates , no data were available because the mosquito fossil record is too recent and rare .
COI K2P intraspecific nucleotide difference of Nyssorhynchus species using the fragment of the Folmer region available in GenBank with current A. marajoara and individual lineage comparisons below the double line.
Intraspecific difference mean (SD)
A. albitarsis s.s
0.010 (± 0.003)
0.004 (± 0.002)
0.015 (± 0.003)
0.014 (± 0.003)
0.015 (± 0.003)
0.012 (± 0.003)
0.010 (± 0.003)
A. marajoara s.l
0.024 (± 0.003)
0.016 (± 0.002)
DQ07616, DQ076218-20, DQ076222-24
0.014 (± 0.002)
DQ076217, DQ076221, DQ076225
Inter- and intra-population differentiation of A. marajoara lineages and populations on either side of the SAMOVA generated boundary.
COI (488-bp fragment)
white (496 bp)
Lineage 1 vs. 2 ( N = 289)
Lineage 1 E vs. W ( N = 209)
East vs. West
K s *
Analysis of mitochondrial data from 294 samples among seven riverine and seven non-riverine localities in Amazonas, Pará, and Amapá states, Brazil (Figure 3) led to the discovery of two co-occuring, but distinct A. marajoara lineages that are unable to be connected using statistical parsimony (Figure 2). Therefore, a median joining network that uses alternate algorithms to remove homoplasy was conducted and indicated that a minimum of 13 mutational steps separate the lineages, shown in Figure 2. Seven fixed mutations were estimated between lineages based on DnaSP. Additionally, A. oryzalimenetes was identified from Itaituba and from Macapá, a new record (Figure 3).
COI haplotypes were detected (Figure 2). Of the A. marajoara haplotypes, 26 (44.8%) were shared among localities, and 32 (55.2%) were unique (Additional file 3). The most common haplotypes were 1 (n = 16), 2 (n = 26), 3 (n = 20) and 5 (n = 78) of A. marajoara lineage 1, and haplotypes 43 (n = 9) and 46 (n = 9), 50 (n = 32), and 56 (n = 11) of A. marajoara lineage 2 (Figure 2). There was a relatively high proportion of singletons (32, 55.17%) in the A. marajoara lineages combined, with 24/42 for lineage 1, and 9/16 for lineage 2. Overall, both A. marajoara lineages were indicative of a category II pattern  characterized by pronounced genetic gaps between some branches and the co-distribution of principle lineages over a wide area, which could theoretically arise in a species with large evolutionary N e (effective population size) and high gene flow. Lineage 1 contains a star shaped node surrounding haplotype 5, with short branches and an excess of singleton mutations, predominantly from the northeastern localities, which suggests a demographic expansion, background selection or selective sweep [74, 84]. In contrast, lineage 2 indicates balancing selection with its longer branches, missing haplotypes and near equal distribution of shared haplotypes and single mutations, consistent with a signal of an older lineage. It could be argued that haplotype 50 (lineage 2, Figure 2) and the 5 singletons that arise from it also constitute a star-shaped node, which suggest recent expansion. Lineage 2 appears to be restricted to the westernmost localities.
Results of neutrality tests based on COI sequences of A. marajoara from Amazonian Brazil.
The combined sequence results support divergence in A. marajoara depicting separate lineages, distinct from A. janconnae. The more ancestral lineage 2 appears monophyletic, whereas moderate substructure within lineage 1 suggests paraphyly. Nuclear markers, however, consistently depict a single A. marajoara lineage. Discrepancies between mtDNA and nDNA are not enough to refute speciation, for example the newly recognized species A. janconnae and A. marajoara form a single lineage using the protein coding nuclear white gene and ITS2 sequences [20, 21], but complete COI sequences  indicate monophyly of each taxon, and morphometrics analysis depicts A. janconnae as a separate taxon in the complex . Nuclear mutation rates are generally slower than mitochondrial ones , but see Hellberg , thus the lack of divergence detected in the nuclear sequences in A. marajoara in the present study suggests either the differentiation is restricted to the mitochondrial genome, or it is recent and not yet visible in the nuclear genome. Given that lineage 1 has a broader distribution and includes samples from near the type locality, lineage 1 would be more appropriately named A. marajoara s.s. The status of lineage 2 is not yet resolved; it may be a new cryptic species belonging to the complex as suggested by Wilkerson et al. (2005) from Manaus, Brazil  and confirmed by JF Ruiz (personal communication) using the COI barcode region of the Albitarsis subgroup across South America.
Genetic variation between lineages was explained by 61.94% between group differences. Pairwise divergence estimates based on the 488-bp fragment confirmed 3.82% divergence between lineages (Table 3), which is comparable to the values observed among well-known Albitarsis Complex species (for example, 3.52% between A. albitarsis s.s and A. oryzalimnetes). These data may support cryptic or incipient speciation . The effectiveness of the Folmer region and the BOLD threshold of 3% are questionable for at least some species of Nyssorhynchus because there is only moderate support between known species A. oryzalminetes and A. albitarsis s.s. If the average intraspecific variation of the Folmer region for the subgenus (11.6%) is used as a species threshold, the pairwise divergences of all sister taxa (including A. darlingi and A. albimanus) do not exceed this standard. Such a high SST is likely to result in false negatives even among recognized species, therefore separate intraspecific values and interspecific divergences were examined for resolution, where intraspecific COI variation was <2% and interspecific variation was between 6-10%, comparable to other anopheline species differences [[92, 93], unpublished data].
Pleistocene divergence in mtDNA has also been detected for a wide range of insect taxa across Central and South America, including L. longipalpis , H. erato , R. prolixus , and mosquitoes A. darlingi  and A. albimanus  suggesting climatic changes may be a common force driving Neotropical speciation. Pleistocene divergence coupled with paleoclimate records of the Miocene describing the world as warmer and exhibiting greater humidity and precipitation ;  support the more ancestral depiction of lineage 2. Pleistocene climatic changes, including lower temperature, precipitation and carbon dioxide levels than those exhibited today, may have contributed to the divergence of lineage 1 [98, 99].
Lineages co-occur along the Amazon River with no obvious species diversity gradients, although data indicate several "high diversity localities" including Santarém, which is a recognized hotspot. However, this may be an artefact resulting from the relative logistical ease of access to these localities and the many studies that have been conducted there . The geographic barrier detected within A. marajoara lineage 1 is in the same region as the mtDNA divisions between northeastern and central western Amazon populations in A. darlingi , A. nuneztovari  and Atta species ants . With similar distributions and geographic boundaries noted in both mosquitoes and ants, it seems unlikely that the boundary is the result of dispersal abilities. The detection of shared haplotypes in the present study on both sides of the barrier indicates a permeable boundary, suggesting the more geographically restricted lineage 2 may be an artefact of incomplete sampling, or perhaps a result of climate variation.
Climate, particularly rainfall, is a strong descriptor of broad-scale species-richness patterns in the tropics . Vasconcelos et al  noted a restricted geographic distribution of ants in the western part of the basin, which to a great extent reflected a west to east gradient of decreasing rainfall. Additionally, Sombroek  explained an intricate pattern of continuous rainfall in the western interior of the Amazon, in contrast to the coasts, with a pronounced dry season containing a dry belt or corridor occurring near Rio Jari, Amapá state (near the SAMOVA barrier for A. marajoara lineage 1). Miocene flooding  and subsequent Pleistocene climatic events may explain the porous nature of the barrier.
Recent deforestation and habitat fragmentation resulting from the 1967 initiation of the single largest forest plantation in Latin America, an agro-forestry venture known as the Jari project  in northern Brazil could have contributed to an increase in A. marajoara abundance in this region. Clearing of forest, in combination with an increase in human and domestic animal host abundance, may have led to an A. marajoara population increase in Macapá, the capital of Amapá -state, in northeastern Amazonian Brazil .
Among the riverine localities, the hypothesis of sudden population expansion could not be rejected, although the mismatch distributions were smooth and bimodal for both lineages. The reduced genetic variation of A. marajoara lineage 1 in the northeastern Amazon region coupled with the unimodal distribution and the star-like pattern of the haplotype network are consistent with a rapid population expansion in the northeast. A. marajoara tends to breed in marshy sunlit pools and may have expanded into the northeast where savannah and agricultural habitats would be favorable. The population expansion occurred during the Holocene, possibly indicating a recent colonization. This explanation is supported by the expansion of savannah during periods of fire-associated drought and extinction-recolonization of rainforest tree populations in lowland areas during the Holocene [108–110]. Moderate gene flow in lineage 1 may hamper local vector control in Amazonas, Pará and Amapá states because of potential reinvasion and the spread of insecticide resistant genes .
Anopheles marajoara is an important vector in the savannah areas of South America , and with some of the highest incidences of malaria occurring in the states of Pará and Amapá [14, 15, 112] it is likely that A. marajoara will continue to increase its regional importance in malaria transmission. Additional studies will be needed to determine whether the subdivision resulting in the two co-occurring lineages in A. marajoara is strictly the product of genetic variation and evolutionary processes impacted by geographical barriers, or variations in ecology and behaviour that may have lead to niche divergences and microallopatry [113–116]. If both lineages are implicated as vectors, the characteristics of their breeding sites, their behaviours and their migration history could be used to predict changes in malaria transmission patterns in the Amazon basin  and other endemic regions, and provide useful information for targeted vector control strategies.
We thank M Povoa's research group at Instituto Evandro Chagas/SVS/MS in Ananindeua, Pará, Brazil as well as Rosa and Roger Hutchings, Instituto Nacional de Pesquisas da Amazônia, Manaus, Brazil, for assistance in mosquito collection and logistics. We also appreciate the help of W. Kilpatrick, UVM, with analysis and interpretation of the eastern Amazon data early on. Funding for this study was provided by Instituto Evandro Chagas, Ananindeua, Pará, Brazil and NIH grants 1T32AI05532901A1, "Training in Biodefense and Emerging Infectious Disease" and NIH 2R01 A154139 to JEC. Permission to collect in the Amazon basin was provided by the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq) processo: CMC 028/04 - Expedição Científica and IBAMA RMX 021/04 (Projeto: ?Biologia e sistemática de vetores de malária no Brasil: genética e ecologia?).
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.