Comparative physical genome mapping of malaria vectors Anopheles sinensis and Anopheles gambiae

Background: Anopheles sinensis is a dominant natural vector of Plasmodium vivax in China, Taiwan, Japan, and Korea. Recent genome sequencing of An. sinensis provides important insights into the genomic basis of vectorial capacity. However, the lack of a physical genome map with chromosome assignment and orientation of sequencing scaffolds hinders comparative analyses with other genomes to infer evolutionary changes relevant to vectorial capacity.
Results: Here, a physical genome map for An. sinensis was constructed by assigning 52 scaffolds onto the chromosomes using fluorescence in situ hybridization (FISH). This chromosome-based genome assembly comprises approximately 36% of the total An. sinensis genome. Comparisons of 3955 orthologous genes between An. sinensis and Anopheles gambiae identified 361 conserved synteny blocks and 267 inversions fixed between these two lineages. The rate of gene order reshuffling on the X chromosome is approximately 3.2 times higher than that on the autosomes.
Conclusions: The physical map will facilitate detailed genomic analysis of An. sinensis and contribute to the understanding of the patterns and mechanisms of large-scale genome rearrangements in anopheline mosquitoes.
Electronic supplementary material: The online version of this article (doi:10.1186/s12936-017-1888-7) contains supplementary material, which is available to authorized users.

FLX sequencing approach with a Chinese laboratory strain and assembled into 9595 scaffolds spanning 220.8 million base pairs (Mb) [10]. At almost the same time, the complete transcriptome of this species was obtained using Illumina paired-end sequencing technology, and 38,504 unigenes were identified from another Chinese strain [11]. Later, the genome of a different strain of An. sinensis ('SINENSIS') was sequenced and assembled for comparative analyses by the 16 Anopheles mosquito genome project [12]. However, all these research efforts resulted in large numbers of scaffolds and contigs without chromosome assignment or orientation. A physical map for An. sinensis, with scaffolds and contigs localized on the chromosomes, will increase the quality of comparative genomic analyses with other mosquitoes that have chromosome-based genome assemblies, e.g. Anopheles gambiae. Such analyses will allow an exploration of the genomic basis of vectorial capacity and a study of the patterns of chromosome homology and rearrangements between species.
So far, physical maps have been developed for several Anopheles mosquito species including An. gambiae, Anopheles funestus, Anopheles stephensi, Anopheles atroparvus and Anopheles albimanus. These maps improved the draft genome assemblies and helped to clarify genome organization and evolution [13]. Anopheles gambiae and An. funestus represent two major African malaria vectors, while An. stephensi is a dominant vector in Asia. These species belong to the subgenus Cellia but to different series within it: Pyretophorus (An. gambiae), Myzomyia (An. funestus), and Neocellia (An. stephensi) [12]. Comparisons of the mapped genomes of An. funestus and An. stephensi with the An. gambiae genome have demonstrated that the X (sex) chromosome and the 2R arm are much more prone to rearrangement than the other chromosomal arms [14,15].
Changes in gene order between An. gambiae and other species, including An. atroparvus and An. albimanus, demonstrated that the difference in the rate of evolution between the sex chromosome and autosomes is more than threefold [12]. A recent comparative genomic study between An. gambiae within genus Anopheles and Aedes aegypti in Culicinae also revealed that the sex-determining chromosome has a higher rate of genome rearrangements than autosomes [16]. However, whether fast evolution of the sex chromosome occurs in the majority of anophelines will not be clear until more species are investigated.
This study aimed to construct a physical map for An. sinensis by anchoring scaffold sequences onto the polytene chromosomes and to identify conserved synteny blocks and fixed inversions between An. sinensis and An. gambiae for exploring the patterns of chromosome evolution in Anopheles mosquitoes.

Mosquito strains and chromosome preparation
The Wuxi laboratory strain (Jiangsu Institute of Parasitic Diseases, Wuxi, China) of An. sinensis was used in this study. Polytene chromosome preparations were made using salivary glands dissected from early fourth-instar larvae of An. sinensis as previously described [17]. Chromosomes with clear banding patterns were fixed in liquid nitrogen and dehydrated in 50, 70, 90 and 100% ethanol for in situ hybridization.

Fluorescence in situ hybridization
Genome sequences of the An. sinensis China strain were acquired from the database of Zhou et al. [10]. Polymerase chain reaction (PCR) primers for An. sinensis scaffolds were designed using the Primer3 program [18]. PCR was performed using as templates genomic DNA of An. sinensis extracted from live fourth-instar larvae with the DNeasy Blood & Tissue Kit (Qiagen GmbH, Hilden, Germany). After amplification, the PCR products were cut and purified from the agarose gel using a QIAquick Gel Extraction Kit (Qiagen GmbH, Hilden, Germany) and then labelled with either Cy3-AP3-dUTP or Cy5-AP3-dUTP (GE Healthcare UK Ltd., Chalfont St Giles, UK) using a Random Primed DNA Labelling Kit (Roche Applied Science, Penzberg, Germany). Following in situ hybridization performed using a previously described method [19], fluorescent signals were detected and recorded with a Zeiss LSM 710 laser scanning microscope (Carl Zeiss Microimaging GmbH, Oberkochen, Germany) and finally mapped to the cytogenetic map of An. sinensis [17].

Gene orthology, syntenic blocks and fixed inversion
OrthoDB was used to identify one-to-one orthologues from An. sinensis and An. gambiae and to determine their locations on the scaffolds [20]. The comparative positions of the orthologous genes from An. sinensis and An. gambiae were plotted using genoPlotR [21]. Synteny blocks for each pair of homologous chromosome arms between An. sinensis and An. gambiae were identified from the orthology data generated by OrthoDB (Additional file 1). Chromosomal regions containing two or more orthologous genes with the same order and orientation were defined as synteny blocks and numbered 1, 2, 3, etc. along the chromosomes. Once all synteny blocks had been numbered, the inversion distances between homologous chromosome arms of An. sinensis and An. gambiae were estimated using the Genome Rearrangements In Man and Mouse (GRIMM) program [22].
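The synteny block definition above can be illustrated with a minimal sketch. It assumes one possible reading of "same order and orientation": neighbouring orthologues along one species must also be neighbours in the other, either collinear on the same strand or reversed on the opposite strand. The input encoding, the function names and the toy data are hypothetical and are not taken from the OrthoDB output used in the study.

```python
from typing import List, Tuple

# One orthologue on a chromosome arm, listed in species A gene order:
# (rank of the gene in species B gene order, relative strand: +1 same, -1 opposite).
# This encoding is an assumption made for illustration only.
Ortholog = Tuple[int, int]


def _extends(prev: Ortholog, nxt: Ortholog) -> bool:
    """True if nxt continues the block ending at prev.

    Same strand (+1): the species B rank must increase by 1.
    Opposite strand (-1): the species B rank must decrease by 1.
    """
    step = nxt[0] - prev[0]
    return prev[1] == nxt[1] and step == prev[1]


def synteny_blocks(orthologs: List[Ortholog], min_genes: int = 2) -> List[List[Ortholog]]:
    """Group consecutive orthologues into blocks of at least min_genes genes."""
    blocks: List[List[Ortholog]] = []
    current: List[Ortholog] = []
    for gene in orthologs:
        if current and _extends(current[-1], gene):
            current.append(gene)
        else:
            if len(current) >= min_genes:
                blocks.append(current)
            current = [gene]
    if len(current) >= min_genes:
        blocks.append(current)
    return blocks


# Toy arm: a collinear run, an inverted run, then two isolated genes.
toy_arm = [(1, 1), (2, 1), (3, 1), (7, -1), (6, -1), (5, -1), (4, 1), (9, 1)]
toy_blocks = synteny_blocks(toy_arm)
```

In this toy arm, two blocks are recovered and the two isolated genes are discarded because a block needs at least two genes.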

Chromosome evolution in Anopheles mosquitoes
Chromosome evolution rates, expressed as inversions/Mb/MY, were calculated as inversion number/mapped genome size/divergence time. To compare the evolution rates for each chromosomal arm in different species, previously published data were included in the analysis [12]. To explore the fast evolution of the sex chromosome, phylogenetic relationships of the 17 anopheline species were considered [12].
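As a worked illustration of this calculation, the sketch below applies the formula to figures reported elsewhere in the text for the X chromosome (101 inversions, 13.42 Mb of mapped sequence, 58 MY divergence time). The function name is hypothetical, and the rounded inputs give values that may differ slightly in the last digit from those in the tables.

```python
def inversion_rate(inversions: int, mapped_mb: float, divergence_my: float) -> float:
    """Rate of fixed inversions in inversions/Mb/MY."""
    if mapped_mb <= 0 or divergence_my <= 0:
        raise ValueError("mapped size and divergence time must be positive")
    return inversions / mapped_mb / divergence_my


# X chromosome figures as reported in the text.
x_density = 101 / 13.42                      # inversions per Mb
x_rate = inversion_rate(101, 13.42, 58.0)    # inversions per Mb per MY

# Ratio of the reported X density (7.527) to the reported average
# autosomal density (2.367); the divergence time cancels in this ratio.
x_to_autosome_ratio = 7.527 / 2.367
```

Because both rates are divided by the same divergence time, the ratio of X to autosomal rates equals the ratio of the corresponding inversion densities, which is about 3.2.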

A physical genome map of Anopheles sinensis
For physical mapping, scaffold sequences were acquired from the database of Zhou et al. [10]. Two pairs of PCR primers were designed from the start and the end of each scaffold. After amplification, the Cy3- and Cy5-labelled probes were hybridized to the polytene chromosomes of An. sinensis. Two examples of fluorescence in situ hybridization (FISH), with one clear signal in each, are presented in Fig. 1. A total of 104 clones were mapped to the polytene chromosomes of An. sinensis to determine the chromosomal locations of 52 scaffolds. The physical map and the localizations of the 52 An. sinensis scaffolds are summarized in Fig. 2 and Table 1, respectively. Of the 52 scaffolds, the orientations of 48 scaffolds could be determined, and four scaffolds have unique chromosome locations. This physical map includes 26 of the 30 largest scaffolds. The largest scaffold, AS2_scf7180000696055, with a size of 5,918,260 bp, was mapped to regions 38C to 39C of the 3L arm, and the second largest scaffold, AS2_scf7180000696060 (4,138,565 bp), was localized to 22C-23B of the 2L arm of An. sinensis (Table 1). Although X is the shortest chromosome, it had the best mapping coverage among the five chromosomal arms, with eight mapped scaffolds from telomere to centromere, representing 13.42 Mb of the genome. Chromosome arms 2R, 2L, 3R and 3L had 12, 8, 14 and 10 scaffolds, respectively (Table 2). The An. sinensis physical map comprises 79.32 Mb, or 36%, of the total assembled (220.8 Mb) genome sequence (Table 2).
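The logic of orienting a scaffold from two probes, and the coverage figure quoted above, can be sketched as follows. This is only an illustration: it assumes that cytological positions have already been converted to a numeric coordinate increasing along the arm, which is a hypothetical encoding, and the function names are not from the study.

```python
from typing import Optional


def scaffold_orientation(start_probe_pos: float, end_probe_pos: Optional[float]) -> str:
    """Orientation of a scaffold relative to the chromosome arm.

    '+' if the probe from the scaffold start maps before the probe from
    the scaffold end, '-' if it maps after, 'unknown' if only one probe
    position is available or both probes map to the same position.
    """
    if end_probe_pos is None or start_probe_pos == end_probe_pos:
        return "unknown"
    return "+" if start_probe_pos < end_probe_pos else "-"


def mapped_fraction_percent(mapped_mb: float, assembly_mb: float) -> float:
    """Percentage of the assembly placed on chromosomes."""
    return 100.0 * mapped_mb / assembly_mb


# Coverage figures as reported in the text: 79.32 Mb of 220.8 Mb.
coverage = mapped_fraction_percent(79.32, 220.8)
```

With the reported sizes, the mapped fraction evaluates to about 35.9%, consistent with the approximately 36% stated above.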
The physical map of An. sinensis presented in this study was compared with previous mapping data summarized in Table 3. Among mapped anopheline genomes, An. albimanus had the most complete chromosomally anchored genome assembly, covering 98.2% of the genome, followed by An. gambiae, An. stephensi and An. atroparvus [12,23] (Table 3). Mapping of 52 scaffolds in An. sinensis and 103 scaffolds in An. funestus achieved similar portions of mapped genomes in both species (Table 3). Thus, the new genome map of An. sinensis can be used for exploration of chromosomal evolution in malaria mosquitoes.

Synteny and gene order evolution in An. sinensis and An. gambiae
A total of 3955 one-to-one orthologues were identified from An. sinensis and An. gambiae using OrthoDB [20] (Additional file 1). The comparative positions of genes within mapped scaffolds based on orthology relationships were plotted on An. sinensis and An. gambiae chromosomes using genoPlotR [21] (Fig. 3). Physical mapping data were used to determine the orientations of scaffolds, and default orientations were assigned to scaffolds with only one probe. Figure 3 shows that the gene orders were reshuffled on all five chromosome arms because of fixed inversions. The gene order changes on the X chromosome were more dramatic than those on the autosomal arms 2R, 2L, 3R and 3L. The comparative chromosomal locations and orientations of the 3955 orthologous genes were further used to determine the number of synteny blocks in the two species. Synteny blocks were defined as genomic regions containing at least two orthologous genes with the same order and orientation. A total of 361 synteny blocks were identified between An. sinensis and An. gambiae (Table 4). The analysis revealed that the average length of the 112 synteny blocks on the X chromosome (85,989 bp) is much smaller than the averages on the remaining chromosome arms (237,175; 239,627; 197,751 and 242,299 bp). Additionally, the largest synteny block on the X chromosome is only 766,489 bp, whereas the largest block on 2R is 1,796,395 bp, more than twice that on the X chromosome (Table 4). These results indicate that the X chromosome has smaller synteny blocks than the autosomes.
To further analyse fixed inversions between An. sinensis and An. gambiae, we input the order of the 361 synteny blocks (Additional file 2) into the Genome Rearrangements In Man and Mouse (GRIMM) program [22]. Table 5 shows that a minimum of 267 inversions were estimated between An. sinensis and An. gambiae. The sex chromosome exhibited the greatest number of inversions (101), whereas the autosomal arms 2R, 2L, 3R and 3L had 42, 51, 33 and 40 inversions, respectively (Table 5). The total size of mapped scaffolds on each An. sinensis chromosome arm was used to calculate the density of inversions per megabase. Our data demonstrate that the density of inversions on the X chromosome is 7.527 per megabase, which is approximately 3.2 times greater than the average density of inversions on the autosomes (2.367) (Table 5). Among the autosomes, the inversion density between An. sinensis 3R and An. gambiae 2R is 2.997 inversions/Mb, which is higher than for the remaining autosomal arms. The 3L arm exhibits the lowest density of inversions (2.133 inversions/Mb). The most recent study of chromosome evolution in Anopheles used a divergence time of 58 MY between An. atroparvus and An. gambiae [12]. Because An. atroparvus and An. sinensis belong to the same subgenus, Anopheles, the same divergence time was used here to calculate the number of inversions/Mb/MY (Table 5).
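To make the input to the inversion analysis concrete, the sketch below encodes the order of synteny blocks on one arm as a signed permutation and computes a simple breakpoint-based lower bound on the number of inversions. This is not the algorithm implemented in GRIMM, which computes the exact minimum; it only illustrates why reshuffled block order implies a minimum number of inversions. The toy block order and function names are hypothetical.

```python
from math import ceil
from typing import List


def breakpoints(signed_order: List[int]) -> int:
    """Count breakpoints in a signed permutation of blocks 1..n.

    The order is framed by 0 and n + 1; a consecutive pair (a, b) is an
    adjacency when b - a == 1 and a breakpoint otherwise.
    """
    n = len(signed_order)
    if sorted(abs(b) for b in signed_order) != list(range(1, n + 1)):
        raise ValueError("expected a signed permutation of 1..n")
    framed = [0] + list(signed_order) + [n + 1]
    return sum(1 for a, b in zip(framed, framed[1:]) if b - a != 1)


def inversion_lower_bound(signed_order: List[int]) -> int:
    """One inversion removes at most two breakpoints, so d >= ceil(b / 2)."""
    return ceil(breakpoints(signed_order) / 2)


# Toy arm: blocks 2 and 3 are inverted together relative to the reference.
toy_order = [1, -3, -2, 4]
toy_breakpoints = breakpoints(toy_order)
toy_bound = inversion_lower_bound(toy_order)
```

For the toy order, two breakpoints are found and the bound is one inversion, which here matches the single inversion that produced the arrangement. In general the bound can be lower than the true inversion distance.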

Rapid evolution of the sex chromosome in Anopheles mosquitoes
To understand the pattern of inversion fixations in malaria mosquitoes, the number of inversions/Mb/MY in our analysis was compared with earlier published data [12] (Table 6). The results revealed that inversion rates on autosomes varied between An. gambiae and each of five Anopheles species. However, the density of fixed inversions on the X chromosome was consistently greater than that on the autosomes (Table 6), suggesting faster evolution of the X chromosome in Anopheles mosquitoes. The ratio of the X chromosome evolution rate to the autosomal rate of rearrangements between An. sinensis and An. gambiae was also calculated, and our data demonstrated that the X chromosome evolved approximately 3.2 times faster than the autosomes. Our chromosomal evolution data were added to the phylogeny of the 17 anopheline species constructed by Neafsey et al. [12] from the aligned protein sequences of 1085 single-copy orthologs. Figure 4 shows that the ratio of the X chromosome rate to the autosomal rate of rearrangements varies among the Anopheles lineages, being higher in the subgenera Anopheles and Nyssorhynchus and lower in the subgenus Cellia.

A physical map is a critical tool for improving a genome assembly and for studying chromosomal evolution
In this study, a physical map was constructed for an Asian malaria vector, An. sinensis, using fluorescence in situ hybridization (FISH) of DNA probes with polytene chromosomes. The physical mapping of An. sinensis placed 52 large scaffolds with a total length of 79,322,722 bp from the genome database onto the chromosomes (Fig. 2; Table 2). This accounts for approximately 36% of the total assembled (220.8 Mb) genome sequence of An. sinensis (Table 2). So far, several genome maps have been developed for malaria mosquitoes, and we compared the percentage of the physically mapped genome in An. sinensis with data from other species [12,23]. Among mosquitoes, the African malaria vector An. gambiae was the first to have its genome sequenced [24]. More than 2000 BAC clones were originally placed onto the chromosomes for genome mapping and, later, additional mapping added small scaffolds to the area around the centromeres, which resulted in ~84.3% of the An. gambiae genome assembly being mapped [25]. The physical map of An. albimanus initially placed ~76% of the genome onto the chromosomes [12], while a more recent physical mapping effort reached 98.2% coverage of the An. albimanus genome assembly [26], which is the most complete genome assembly to date. The genome of An. stephensi, a key vector of malaria throughout the Indian subcontinent and Middle East, has also been sequenced and assembled. A total of 86 scaffolds were in situ hybridized to the polytene chromosomes of An. stephensi, representing 62% of the genome assembly [23]. Anopheles atroparvus and An. funestus had mapped portions covering 39.6 and 35.1% of the total genome, respectively [12]. In this research, our new physical map for An. sinensis covers 35.9% of the genome, which is within the range of other Anopheles species (Table 3).
Fig. 4 Reconstructed phylogenetic relationships of the 17 anopheline species and chromosomal evolution analysis from Ref. [12]. The aligned protein sequences of 1085 single-copy orthologs were used to construct the maximum likelihood molecular phylogeny. Chromosome evolution analysis was conducted between the species indicated with a dark font and An. gambiae. Comparative physical mapping has not been performed for the species marked with a grey font. Ma represents million years ago. The number in brackets after the divergence time is the ratio of the X chromosome evolution rate to the autosomal rate of rearrangements in each species compared with An. gambiae

Fast evolution of the sex chromosome in Anopheles mosquitoes
The availability of genome sequences and physical maps for Anopheles mosquitoes has promoted detailed analysis of the patterns of fixed inversions [12,13]. In our study, 361 conserved synteny blocks and 267 fixed inversions were identified between An. sinensis and An. gambiae. Analysis of the density of inversions per Mb and the rate of chromosomal rearrangements between An. sinensis and An. gambiae suggested that fast evolution occurs on the sex chromosome. The earliest study of inversions in closely related species of the An. gambiae complex revealed that 5 of 10 inversions were on the X chromosome, providing the first evidence of fast evolution of sex chromosomes in Anopheles mosquitoes [27]. Several species belonging to different series within the subgenus Cellia have been extensively studied: An. gambiae (Pyretophorus), An. stephensi (Neocellia) and An. funestus (Myzomyia) [12]. The comparative analyses between An. funestus and An. gambiae as well as between An. stephensi and An. gambiae [14,15] further demonstrated that the X chromosome evolved faster than the autosomes. The most recent analyses based on the genome assemblies confirmed that the rate of evolution on the X chromosome is approximately 2.2 times faster than the average autosomal rate for An. funestus and An. gambiae [12] or 2.94 times faster for An. stephensi and An. gambiae [23] (Fig. 4). Anopheles sinensis and An. atroparvus are members of the subgenus Anopheles, which is thought to have diverged from An. gambiae 58 MY ago [10,12]. Previous studies have shown that the rate of evolution of the sex chromosome is approximately 3.65 times that of the autosomes between An. atroparvus and An. gambiae [12]. In this study, the density of inversions on the X chromosome was found to be 3.2 times greater than the average density of inversions on the autosomes between An. sinensis and An. gambiae (Fig. 4).
These results suggest that the rapid evolution of sex chromosome is a common feature in Anopheles mosquitoes. The X chromosome rearrangements may play a role in speciation of malaria mosquitoes [14,28]. Future genome studies can provide valuable information for dissecting the role of X chromosome inversions in speciation of malaria vectors.

Conclusions
This study constructed a physical genome map for An. sinensis, an important vector of P. vivax and the most widely distributed malaria vector in China, Korea, and Japan. This physical map includes 52 of the largest scaffolds from An. sinensis, spanning approximately 80 Mb of the 220 Mb, or approximately 36%, of the sequenced genome. The map coverage is similar to the mapped portions of the An. funestus and An. atroparvus genomes. By analysing the comparative positions of 3955 orthologous genes, 361 conserved synteny blocks and 267 fixed inversions between An. sinensis and An. gambiae were identified. The rate of evolution of the sex chromosome is approximately 3.2 times greater than the average autosomal rate of evolution. Thus, our comparative analysis of An. sinensis and An. gambiae, inferred from physically mapped genome assemblies, provides additional details for understanding chromosome evolution in malaria vectors.