Identification of a fibrinogen-related protein (FBN9) gene in neotropical anopheline mosquitoes

Background Malaria has a devastating impact on public health in many tropical areas. Studies on vector immunity are important for the overall understanding of the parasite-vector interaction and for the design of novel strategies to control malaria. A member of the fibrinogen-related protein family, fbn9, has been well studied in Anopheles gambiae and has been shown to be an important component of the mosquito immune system. However, little is known about this gene in neotropical anopheline species. Methods This article describes the identification and characterization of partial sequences of the fbn9 gene from four species of neotropical anopheline primary and secondary vectors: Anopheles darlingi, Anopheles nuneztovari, Anopheles aquasalis, and Anopheles albitarsis (namely Anopheles marajoara). Degenerate primers were designed based on comparative analysis of publicly available Aedes aegypti and An. gambiae gene sequences and used to clone putative homologs in the neotropical species. Sequence comparisons and Bayesian phylogenetic analyses were then performed to better understand the molecular diversity of this gene in evolutionarily distant anopheline species belonging to different subgenera. Results Comparisons of the fbn9 gene sequences of the neotropical anophelines and their homologs in the An. gambiae complex (Gambiae complex) showed high conservation at the nucleotide and amino acid levels, although some sites show significant differentiation (non-synonymous substitutions). Furthermore, phylogenetic analysis of fbn9 nucleotide sequences showed that neotropical anophelines and African mosquitoes form two well-supported clades, mirroring their separation into two different subgenera. Conclusions The present work adds new insights into the conserved role of fbn9 in insect immunity in a broader range of anopheline species and reinforces the possibility of manipulating mosquito immunity to design novel pathogen control strategies.


Background
Mosquito-borne diseases, including malaria and arboviruses, such as dengue, depend on complex interactions among pathogens, insect vectors, and hosts. Studies of vector immunity are of particular importance to understanding these complex interactions and could lead to the development of novel disease control strategies [1][2][3][4][5][6]. Vectorial competence, which refers to the ability of arthropods to acquire, maintain, and transmit microbial agents [7], is directly related to insect immunity. Several immunity-related genes have been identified in Old World vectors [8][9][10]. However, related studies in neotropical anopheline species are still incipient.
Among the important immunity genes are those encoding members of the fibrinogen-related protein family (FREP or FBN), which are pattern recognition receptors and have been considered promising candidates for parasite control strategies [10][11][12]. Within this family, the fbn9 gene was found to be upregulated when Anopheles gambiae mosquitoes were fed on blood infected with parasites (Plasmodium falciparum) or bacteria (Escherichia coli or Staphylococcus aureus) [10]. Furthermore, when this gene was knocked down, parasite loads increased significantly [10]. More recently, the fbn9 gene was found to be conserved among members of the An. gambiae complex [13]. The An. gambiae complex comprises seven closely related species (An. gambiae, Anopheles arabiensis, Anopheles melas, Anopheles merus, Anopheles bwambae, and Anopheles quadriannulatus A and B), of which An. gambiae and An. arabiensis have been described as important vectors of human malaria [14].
In this study, partial sequences of the fbn9 gene from four species of neotropical anopheline mosquitoes were identified and characterized, and then compared with sequences of the An. gambiae complex available in public databases. Rates of synonymous (silent) and non-synonymous (amino acid-changing) substitutions were also compared to better understand the molecular evolution of the gene. This study allowed a better understanding of the molecular diversity and predicted function of this immunity gene in a broader range of mosquito species.

Mosquito collection and identification
Mosquitoes were collected from different locations (60-70 specimens from each locality; Table 1). Anopheles darlingi specimens were collected from larval breeding sites and by trapping adults in rural areas of Manaus (Amazon, Brazil). Larvae were maintained under rearing conditions and the adults were identified upon emergence. Anopheles albitarsis samples were also collected in Porto Velho (Rondonia, Brazil). Anopheles aquasalis was obtained from a laboratory colony maintained at FIOCRUZ, Belo Horizonte-MG, Brazil, and reared at 27°C, 80% humidity, and a 12 h L:D cycle. Mosquitoes were individually identified by morphology using taxonomic keys, which are based on features of the tarsi, abdomen, and wing veins [21]. The identification of specimens of the An. albitarsis complex was confirmed by sequencing the internal transcribed spacer 2 (ITS2), as described below.

ITS2 genotyping
Genomic DNA was isolated from pools of 10 mosquitoes following a previously published protocol [22]. PCR reactions were performed using the primers CP16 (5'-GCGGGTACCATGCTTAAATTTAGGGGGTA-3') and CP17 (5'-GCGCCGCGGTGTGAACTGCAGGACACATG-3') [23]. Each reaction contained 0.2 μM of each primer, 0.2 mM dNTPs, buffer (50 mM KCl, 10 mM Tris-HCl pH 8.4, 1.5 mM MgCl2, 1 mg/mL gelatin), 20 ng of genomic DNA, and Milli-Q water to a final volume of 15 μL. Samples were subjected to 25 cycles of 94°C 1 min, 50°C 2 min, and 72°C 2 min, and amplified products were visualized on agarose gels. PCR reactions were purified with ExoSAP-IT® (USB), following the manufacturer's protocol. PCR products were cloned into the pGEM®-T Easy vector system (Promega) and, after colony screening, positive clones were sequenced for the ITS2 region. Sequencing reactions were performed using the DYEnamic™ ET Dye Terminator Cycle Sequencing Kit and a MegaBACE™ DNA analysis system (GE Healthcare Life Sciences).
Multiple sequences were obtained for each species (Table 1), depending on the PCR colony screening results, and were then assembled. Sequences were analysed using PHRED/PHRAP/Consed [24], and vector sequences were trimmed with the Crossmatch software [25]. Sequence similarity searches were performed using the blastx program of the BLAST package against different databases [26]. The sequences described in the present work were further compared with those published elsewhere [23,27].

Identification of fbn9
The fbn9 gene sequences of four neotropical anophelines (Table 1) were identified by PCR and sequencing as follows. First, degenerate primers were designed based on highly similar regions between the translated sequences of two fbn9 homologs available in public databases, from An. gambiae (AGAP011197) and Ae. aegypti (AY432284.1). PCR reactions were performed with 0.5 μM of each primer, 5fbn_deg4 (5'-AAYCARGCNCAYYTNGARAA-3') and 3fbn_deg4 (5'-CANCCICCICCRAAYTTNGTYTG-3'), with the following parameters: 94°C 5 min; 15 cycles (94°C 30 sec, 50°C 30 sec, 72°C 1 min) in which the annealing temperature was reduced by 1°C per cycle; followed by 20 cycles (94°C 30 sec, 50°C 30 sec, 72°C 60 sec). Part of the amplified product was checked on an agarose gel. In the absence of non-specific bands, PCR products were re-amplified.
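The degeneracy of the two primers can be checked with a short script. The following is a minimal sketch, not part of the original protocol: it uses the standard IUPAC ambiguity codes and assumes that inosine (I) acts as a single universal base, so it does not multiply the number of oligonucleotides in the mix. The function names are illustrative only.

```python
from itertools import product

# IUPAC nucleotide codes mapped to the bases they represent.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
    # Assumption: inosine (I) is one physical base that pairs broadly,
    # so it is kept as a single symbol and adds no degeneracy.
    "I": "I",
}


def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides encoded by a degenerate primer."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total


def expand(primer: str):
    """Yield every concrete sequence encoded by a degenerate primer."""
    for combo in product(*(IUPAC[base] for base in primer.upper())):
        yield "".join(combo)


FORWARD = "AAYCARGCNCAYYTNGARAA"     # 5fbn_deg4, as given in the text
REVERSE = "CANCCICCICCRAAYTTNGTYTG"  # 3fbn_deg4, as given in the text
```

Under these assumptions, `degeneracy(FORWARD)` gives 512 and `degeneracy(REVERSE)` gives 128, meaning that each primer is a pool of that many distinct oligonucleotides.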
Fragments from the positive PCR reactions were purified using the QIAEX II® Gel Extraction Kit (Qiagen) and cloned into the pGEM®-T Easy vector system (Promega). Bacterial clones (E. coli TOP10) were checked by PCR using the T7 (5'-TAATACGACTCACTATAGGG-3') and SP6 (5'-ATTTAGGTGACACTAG-3') primers (94°C 2 min; 35 cycles of 94°C 30 sec, 45°C 30 sec, and 72°C 60 sec; followed by 5 min at 72°C). PCR products were sequenced as described above using T7 and SP6 primers. High-quality consensus sequences were subjected to sequence similarity searches using the blastx program against an An. gambiae protein database built in the present work. The top BLAST hits to the An. gambiae fbn9 gene sequence (GenBank Gi: 167861677) were considered as potential orthologs for further analysis.

Sequence analysis
All fbn9 gene sequences obtained from An. aquasalis, An. darlingi, An. albitarsis, and An. nuneztovari were aligned with ClustalX [28] and manually adjusted with BioEdit [29]. Analyses of the alignment coverage, the presence of conserved domains, the number of single-nucleotide polymorphisms (SNPs), the percentages of synonymous and non-synonymous substitutions, and the rates of transitions and transversions among the nucleotide sequences were performed with MEGA software [30].
For the studies on selection pressure, another multiple sequence alignment was built that included an additional 60 fbn9 sequences from the An. gambiae complex deposited in GenBank and described elsewhere [13]. Analysis of the alignments in terms of dN, dS, and the dN/dS ratio was again performed with MEGA. To calculate the frequency of synonymous and non-synonymous substitutions, we applied the Nei-Gojobori method with the Jukes-Cantor correction [31,32].
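The final step of this calculation, the Jukes-Cantor correction of the observed proportions of synonymous and non-synonymous differences, can be illustrated as follows. This is a minimal sketch under stated assumptions, not the MEGA implementation: the counts of differences and sites are taken as already computed by the Nei-Gojobori counting procedure, and the function names and example values are hypothetical.

```python
import math


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance, d = -(3/4) ln(1 - 4p/3)."""
    if not 0.0 <= p < 0.75:
        raise ValueError("p must lie in [0, 0.75) for the correction to be defined")
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)


def dn_ds(syn_diffs: float, syn_sites: float,
          nonsyn_diffs: float, nonsyn_sites: float):
    """Return (dN, dS, dN/dS) from Nei-Gojobori style counts.

    The four inputs are assumed to come from a prior counting step:
    numbers of synonymous / non-synonymous differences and sites.
    """
    p_s = syn_diffs / syn_sites        # proportion of synonymous differences
    p_n = nonsyn_diffs / nonsyn_sites  # proportion of non-synonymous differences
    d_s = jukes_cantor(p_s)
    d_n = jukes_cantor(p_n)
    ratio = d_n / d_s if d_s > 0 else math.nan
    return d_n, d_s, ratio


# Hypothetical counts, for illustration only (not data from this study).
example_dn, example_ds, example_ratio = dn_ds(12, 80, 3, 280)
```

A ratio below 1, as in this hypothetical example, is the pattern conventionally read as purifying selection.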

Phylogenetic analysis
For the phylogenetic analysis, a total of 21 fbn9 nucleotide sequences from 10 species were selected: five partial sequences obtained in the present work from An. darlingi, An. aquasalis, An. nuneztovari, and An. albitarsis (two sequences, from Amazon and Rondonia, Brazil), together with 16 sequences from six species belonging to the An. gambiae complex (Table 1). Multiple sequences from the same species, identified as A and B, refer to different alleles of the same specimen or to different specimens collected.
Nucleotide sequences were used because the high level of conservation observed among the amino acid sequences of our species would make a protein-based analysis less informative. Both the 5' and 3' ends of the An. gambiae sequences, which were not present in our sequences, were removed from the final alignment. The final alignment contained 365 sites, which corresponds to 42.9% of the An. gambiae fbn9 gene sequence. Bayesian analysis was performed with the Markov chain Monte Carlo (MCMC) sampling method as implemented in MrBayes (version 3.1) [33]. MCMC analyses were run as four chains (1 cold and 3 heated chains) for 1,500,000 generations, with sampling every 1000 generations and with the initial 25% of the samples discarded as "burn-in". The General Time Reversible (GTR) model with gamma-distributed rate variation among sites and a proportion of invariable sites (GTR + inv + gamma) was adopted. The sequence of An. nuneztovari was used as the outgroup to construct the phylogenetic tree. Support values of the recovered trees were estimated as Bayesian posterior probabilities (pp). Well-supported clusters with a posterior probability of at least 0.80 were considered for data interpretation. The consensus tree was visualized and edited in FigTree, version 1.2 [34].
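The number of MCMC samples implied by these settings can be worked out with simple arithmetic. The sketch below is illustrative only; it counts one sample per sampling interval and ignores the starting state of the chain, so software that also records generation 0 would report one additional sample. The function name is hypothetical.

```python
def mcmc_sample_counts(generations: int, sample_every: int,
                       burnin_fraction: float):
    """Return (sampled, discarded, retained) numbers of MCMC samples.

    Assumption: one sample per interval, starting state not counted.
    """
    sampled = generations // sample_every
    discarded = int(sampled * burnin_fraction)
    retained = sampled - discarded
    return sampled, discarded, retained


# Settings stated in the text: 1,500,000 generations, sampling every
# 1000 generations, first 25% of samples discarded as burn-in.
sampled, discarded, retained = mcmc_sample_counts(1_500_000, 1000, 0.25)
```

With these settings, 1500 samples are drawn, 375 are discarded as burn-in, and 1125 contribute to the consensus tree and posterior probabilities.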

Results
In the present study, partial nucleotide sequences of an important immune-related gene (fbn9) were obtained from four neotropical anopheline mosquito species. Sequencing of the ITS2 region of the neotropical mosquito specimens allowed species to be identified with higher accuracy than by morphological identification alone [21]. The sequences from neotropical anophelines were then compared with publicly available sequences from the An. gambiae complex in selection pressure and phylogenetic analyses.

Gene cloning and sequencing
The fbn9 gene was identified through PCR in An. aquasalis, An. darlingi, An. marajoara, and An. nuneztovari mosquitoes (Figure 1). Results were confirmed through DNA sequencing followed by similarity searches using blastx. All sequences matched FBN9 protein sequences as the top hit, with significant E-values and a high percentage of identity (see Additional file 1: Table S1). Sequences had around 360 nucleotides, corresponding to 42.5% of the gene, from positions 84 to 444 of the sequence (Gi: 167861677).
For An. aquasalis, a higher molecular weight band was detected (Figure 1) and sequenced, but no significant sequence similarity was detected through BLAST searches.

Sequence analysis
The analysis of the sequences obtained for the neotropical mosquito species showed 79 SNPs in 360 nucleotides, of which 11.33% correspond to the first, 7.59% to the second, and 81.01% to the third codon positions. Of the 72 polymorphic codons, 64 correspond to synonymous substitutions and eight are non-synonymous. In total, 61 transitions and 25 transversions were detected. When the An. gambiae complex sequences were included in the analysis, 128 nucleotide polymorphisms and 104 polymorphic codons were observed, resulting in 75 synonymous and 29 non-synonymous substitutions. A high degree of conservation among the amino acid sequences from anopheline mosquitoes was found (Figure 2).
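The distinction between transitions and transversions reported above can be made explicit with a short sketch. It is illustrative only and is not the procedure implemented in MEGA: a transition is a change within purines (A, G) or within pyrimidines (C, T), and any other change is a transversion. Function names and the toy sequences in the usage note are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify(a: str, b: str) -> str:
    """Classify a pair of aligned bases."""
    a, b = a.upper(), b.upper()
    if a not in BASES or b not in BASES:
        raise ValueError("only unambiguous bases A, C, G, T are supported")
    if a == b:
        return "identical"
    if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
        return "transition"
    return "transversion"


def count_ts_tv(seq1: str, seq2: str):
    """Count (transitions, transversions) between two aligned sequences.

    Positions with gaps or ambiguity codes in either sequence are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    ts = tv = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in BASES or b not in BASES:
            continue
        kind = classify(a, b)
        if kind == "transition":
            ts += 1
        elif kind == "transversion":
            tv += 1
    return ts, tv
```

For example, `count_ts_tv("ACGT", "GCTT")` finds one transition (A/G) and one transversion (G/T). The 61 transitions and 25 transversions reported here correspond to a transition/transversion ratio of 2.44.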

Phylogenetic analysis
Phylogenetic relationships among 21 partial fbn9 gene sequences from 10 distinct anopheline species (Table 1) were reconstructed using a Bayesian approach as described above (Methods). The tree topology (Figure 3) suggests the presence of two well-supported sequence groups (pp = 1) corresponding to the African (black) and Neotropical (green) species analyzed in the present study. The former group includes different lineages of five species of the Gambiae complex, with two lineages of An. merus (MER562_A and MER563_A) clustered together with two others of An. quadriannulatus (QUA16 and QUA24-B). Other relationships in the African group are also well resolved considering the posterior probability cutoff (at least 0.80) adopted in our approach.
The phylogenetic relationships among the five neotropical sequences (green) are strongly supported by the analysis, as shown in Figure 3. The two sequences of An. albitarsis (An. marajoara), which originated from Amazon and Rondonia, Brazil, are closely related (pp = 1) and form a sister group with the sequences of An. aquasalis and An. nuneztovari (pp = 0.98). The An. darlingi sequence appears to be the most divergent among the Neotropical species. The phylogenetic analysis also supported the orthology prediction made through the sequence similarity-based searches described above.

Discussion
The unavailability of genomes from neotropical anopheline species hampers most studies of the function and evolution of important genes in these mosquitoes, leaving alternative techniques as the only available way to overcome this problem. Recently, studies on differential gene expression in the coastal malaria vector An. aquasalis infected with Plasmodium vivax have been performed, and three fibrinogen-related genes that differ from fbn9 were found [35].
In An. gambiae, FBN9 is known to have an important function in the mosquito immune system, acting as a pattern recognition receptor. Its expression is up-regulated when mosquitoes are infected with bacteria and Plasmodium species [11]. The participation of this protein in the mosquito's interaction with parasites is very important because it determines the insect's vectorial competence [36].
This study showed that this protein is highly conserved both in An. gambiae and in neotropical species belonging to the subgenus Nyssorhynchus, suggesting that it might have the same function in all these species. As recently stated, fibrinogen-related proteins (FREPs), to which FBN9 belongs, are part of the basal immune surveillance of mosquitoes, interacting with the mosquito bacterial flora [11]. This may explain the conservation, as all mosquito species harbour bacteria.
A comparison of synonymous and non-synonymous substitution rates in protein-coding genes provides an important means for understanding molecular evolution [37]. When the sequences of different anopheline species are compared, the higher frequency of polymorphic sites at the third codon position reflects the predominance of synonymous substitutions. The results presented here are in concordance with previous work analyzing the patterns of molecular evolution in the fbn9 gene using 60 sequences of six species from the Gambiae complex [13]. Here, synonymous substitutions are more frequent than non-synonymous substitutions, resulting in a dN/dS ratio < 1, which corresponds to negative or purifying selective pressure, perhaps limiting alterations at the protein level. This observation suggests that this region may be important to Recently, Lehmann and colleagues studied four immunity-related genes (SP14D1, GNBP, defensin, and gambicin) in An. gambiae and found no evidence that selection was mediated by pathogens that are transmitted to humans [38].
The indication of a conserved function of the fbn9 gene between the Gambiae complex and the Brazilian anophelines is interesting if one considers that the subgenera Cellia and Nyssorhynchus separated around 94 million years ago, based on mtDNA analysis [39]. Further studies on the expression of fbn9 in Neotropical mosquito species and its role during infection with Plasmodium species will provide further information on this important immune-related gene.

Conclusions
In the present work, the fbn9 gene sequences of four neotropical anopheline species have been compared with their homologs in the An. gambiae complex to gain insights into insect immunity. Sequence analysis shows a high degree of conservation of the fbn9 gene in all species compared and suggests that these sequences are under negative selection pressure. Bayesian phylogenetic analysis supports the hypothesis of neotropical anophelines (subgenus Nyssorhynchus) and African mosquitoes (subgenus Cellia) forming two well-supported clades. The present work suggests a possible conserved role for fbn9 in the immune response of a broader range of anopheline species. Identification of the genes involved in the mosquito immune response to parasite infections should allow the design of novel pathogen control strategies.

Figure 3 Bayesian phylogenetic tree of the partial fbn9 gene sequences (Table 1). Branches leading to the Brazilian (present study) and African anopheline species are indicated in green and black, respectively. Posterior probability values (pp) are indicated in this 50% majority-rule consensus (unrooted) tree. Letters A and B following the specimen abbreviation indicate two alleles of a single individual specimen as described elsewhere [13].