Analysis of pir gene expression across the Plasmodium life cycle

Background Plasmodium interspersed repeat (pir) is the largest multigene family in the genomes of most Plasmodium species. A variety of functions for the PIR proteins which they encode have been proposed, including antigenic variation, immune evasion, sequestration and rosetting. However, direct evidence for these is lacking. The repetitive nature of the family has made it difficult to determine function experimentally. However, there has been some success in using gene expression studies to suggest roles for some members in virulence and chronic infection. Methods Here pir gene expression was examined across the life cycle of Plasmodium berghei using publicly available RNAseq data-sets, and at high resolution in the intraerythrocytic development cycle using new data from Plasmodium chabaudi. Results Expression of pir genes is greatest in stages of the parasite which invade and reside in red blood cells. The marked exception is that liver merozoites and male gametocytes produce a very large number of pir gene transcripts, notably compared to female gametocytes, which produce relatively few. Within the asexual blood stages different subfamilies peak at different times, suggesting further functional distinctions. Representing a subfamily of its own, the highly conserved ancestral pir gene warrants further investigation due to its potential tractability for functional investigation. It is highly transcribed in multiple life cycle stages and across most studied Plasmodium species and thus is likely to play an important role in parasite biology. Conclusions The identification of distinct expression patterns for different pir genes and subfamilies is likely to provide a basis for the design of future experiments to uncover their function. Supplementary Information The online version contains supplementary material available at 10.1186/s12936-021-03979-6.


Introduction
The genomes of the malaria parasites (Plasmodium spp.) contain a variety of multigene families. These are generally located in the sub-telomeric regions of chromosomes, a feature which is thought to allow regulation of gene expression by heterochromatin and promote diversification of the gene sequences through recombination [1][2][3]. The most highly studied of these families is var, which encodes the PfEMP1 protein. It is present in ~70 copies in the human malaria parasite Plasmodium falciparum. The primary role of PfEMP1 appears to be sequestration of the parasite in the vasculature, preventing its destruction in the spleen [4]. Multiple copies of var seem to be required to alter the binding specificity and for immune evasion -i.e. switching between antigenically distinct proteins with similar binding properties [5]. This family is however limited to the Laverania subgenus, which includes only one of the five species which infect humans. The Plasmodium interspersed repeat (pir) multigene family has been found in the genomes of rodent malarias, primate malarias and the human-infecting Plasmodium species, P. vivax, P. knowlesi, P. malariae and P. ovale [6][7][8][9]. The number of pir genes in the genomes of these species varies considerably, from the lower numbers of 211 members in P. chabaudi chabaudi AS and 134 members in P. berghei, to more than 1,000 members in P. yoelii and up to 1,949 members in P. ovale curtisi [9,10]. The pir family is therefore considered to be the largest Plasmodium multigene family.
Although it has been suggested that the pir family fulfils similar roles in immune evasion and pathogenesis as P. falciparum var, there is no direct evidence for this, and the function(s) of pirs remains largely unknown. While P. falciparum parasites can only express a single var gene at a time, individual P. berghei, P.vivax and P. yoelii parasites can express multiple pir genes [11][12][13][14][15]. Although some proteomic and immunofluorescence studies show PIR proteins on the surface of infected RBCs (iRBCs), other studies indicate that they are present in the host or parasite cytoplasm, or on the parasitophorous vacuole [6,[16][17][18][19] suggesting multiple different functions for PIR proteins during blood-stage infections. In Plasmodium chabaudi chabaudi AS infections of mice, different pir subfamilies are associated with the acute and chronic phases of the infection, and parasites from the two phases of infection are differently virulent [20]. This association suggests pir gene expression may affect virulence of blood-stage P. chabaudi, and also that pir genes could be involved in evading the initial immune response. Single-cell RNA-seq analysis across the P. berghei life cycle has shown that pir gene expression is particularly high in blood stages [14]. Male gametocytes were found to express a distinct pir repertoire compared to asexual blood stages, with female gametocytes expressing few pir transcripts [13]. Although these data represent only the most highly expressed genes, they suggest there may be different functions for pir genes between asexual and sexual blood stages. This family has been very difficult to study in the laboratory due to there being many genes with varying levels of similarity expressed at a variety of times. Identifying which are expressed at particular life-cycle stages will allow a more targeted approach to determining function. Here we use two rodent-infecting Plasmodium species to investigate the stagespecificity of pir gene expression. We performed a systematic analysis of the P. berghei ANKA pir multigene family from published bulk RNA-seq studies, re-processing the raw data using a single bioinformatics pipeline. While single-cell RNA-seq datasets have provided excellent resolution of gene expression in time across the life cycle [14], bulk RNA-seq datasets provide much higher resolution of the transcriptome itself, detecting a larger proportion of transcripts expressed at any one time. Indeed, expression of relatively few pir genes was detected in the Malaria Cell Atlas [14]. To examine the dynamics of pir expression in asexual blood stages in more detail, we also generated high-resolution transcriptional data from P. c. chabaudi AS, across the asexual blood cycle.
Our study demonstrates that the primary function of the pir multigene family likely plays out in the blood stages, from merozoite formation in the liver to fertilisation in the mosquito midgut. It is unlikely to be important during the post-fertilisation stages of development in the mosquito through to early liver stages. In both rodent malaria models the pir transcriptional repertoire is diverse throughout the intraerythrocytic developmental cycle, however different subfamilies are differently represented over time, suggesting subtle differences in regulation and perhaps function. One pir gene, previously described as the putative ancestral pir [21], is clearly notable as forming the highest proportion of the pir transcriptome of these two species, and of multiple different species of Plasmodium for which transcription data is available. This gene may prove a more experimentally tractable target due to its uniqueness and warrants further study.

Parasites
A cryopreserved stock of a cloned line of Plasmodium chabaudi chabaudi AS, originally obtained from David Walliker, University of Edinburgh, UK, and subsequently passaged through mice by injection of infected red blood cells (iRBC), was used to initiate infections and mosquito transmissions. Transmission of P. chabaudi via Anopheles stephensi mosquitoes has been described in detail previously [22]. Recently mosquito-transmitted (RMT) P. chabaudi derived from a mosquito-initiated infection retain the same phenotype of infection course and gene expression profiles of directly mosquito-transmitted P.
chabaudi infections [20,23]. RMT parasites are used here to ensure that each mouse receives a consistent inoculum of blood-stage parasites, rather than the temporarily more variable appearance of parasites in the blood from either a MT-or sporozoite-initiated infection [23]. For RMT-blood stage infections, mice were infected by intraperitoneal injection of 10 5 iRBC. Blood samples were collected at three-hour intervals over one 24hr asexual cycle. Infections were monitored by light microscopy on Giemsa-stained thin blood smears. Parasitaemia across the 24hrs ranged from 2-10% (Supplementary Information 2).
For the P. berghei datasets the liver stages, and some of the asexual blood stages (rings, trophozoites and schizonts), were defined by time of development in culture. In some published data sets, male and female gametocytes were analysed separately, while in others gametocytes were mixed. The 6h Liver samples from [34] were removed due to low read counts, also removed by the authors in the original publication. Additionally, a 2h Liver sample (SRR11142819) from [35], and two schizont samples (SRR3437888 and SRR3437912) from [36] were removed because hierarchical clustering showed that they were dissimilar to the other samples. The annotations of pir genes in each species were taken from PlasmoDB v48 and can be found on the first sheet of Supplementary Information 2 (P. berghei) and the second sheet of Supplementary Information 6 (P. chabaudi). Note that this version of PlasmoDB annotated the orthologous genes PBANKA_0524600 and PCHAS_0524800 as members of the pir family, but we believe this to be erroneous due to their two-exon structure and lack of similarity with other pir genes. Hence, they were excluded from analysis here. The ChAPL/AAPL information for P. chabaudi was taken from Brugat et al., 2017 [20].

Deconvolution of P. chabaudi bulk transcriptomes
To determine the relative proportions of different life stages in the P. chabaudi transcriptomes, we used the approach described in Aunin et al. [52]. Briefly, we used pseudobulk samples (excluding mosquito and liver stages) derived from the Malaria Cell Atlas [14] as a reference to deconvolute with CIBERSORT v1.06 using default settings [53].  [63].

Ancestral pir gene transcriptional analysis
We then investigated whether the orthologs of the ancestral pir gene were also highly transcribed in other Plasmodium species for which transcription data is available.
TPM/RPKM from published studies ( Table 2) was used to calculate the proportion of ancestral pir expression relative to the rest of the transcriptome.

Data availability
The RNA-seq data relating to P. c. chabaudi AS intraerythrocytic developmental cycle are available from the ENA (ERP002273). The relationship between individual samples and ENA accessions is described in Supplementary Table 1.

Asexual blood stages, liver merozoites and male gametocytes are the foci of pir gene expression in Plasmodium berghei.
The rodent pir gene family is divided into two groups named S and L based on sequence similarity and average gene length [12]. These are further classified into subfamilies (L1-L4; S1-S8), with some being unique to a particular rodent species (e.g. S7, P. chabaudi; S8, P. berghei). In order to determine whether an association exists between transcription of any pir subfamily and a particular life cycle stage, we performed a systematic analysis of published transcriptome studies from the rodent Plasmodium species, P. berghei ANKA ( Figure 1, Table 1 (PBANKA_0514900) [64] and nek4 (PBANKA_0616700) (female gametocytes and ookinetes) [65], mapk2 (PBANKA_0933700) (male gametocytes) [66], and hap2 (PBANKA_1212600) (gametes) [67]. Their transcriptional activity confirmed that the asexual blood stages had few gametocyte contaminants.
The genome of P. berghei ANKA contains 134 pir genes. Although multiple pir genes are transcribed at all stages of the life cycle (Figure 2A), this begins at very low levels in salivary gland sporozoites and early liver stages, and is then followed by an increase in both the level of transcript, and the total number of pir genes transcribed, in the later liver stages.
One study analysed RNAseq data from liver merozoites [34], and we find that they display a dramatic increase in the levels of pir gene transcripts over earlier liver stages. High levels of transcript are maintained in asexual blood stages declining somewhat by the schizont stage. There are relatively few pir transcripts in female gametocytes, whereas the level of pir transcription in male gametocytes is of a similar or greater magnitude and breadth to that observed in asexual blood stages. Ookinetes within the mosquito midgut transcribe fewer pirs at lower transcription levels, similar to late liver stages, and female gametocytes.
There is one subfamily of pirs, distinct from S and L, containing only a single member. It has syntenic orthologues in the genomes of rodent and primate-infecting Plasmodium species.
This pir and its orthologs have been described as the ancestral pir, because the other subfamilies may have derived from it [20]. The P. berghei ancestral pir orthologue (PBANKA_0100500) is consistently highly transcribed from late liver stages, through asexual intraerythrocytic development and in blood stage gametocytes (shown in yellow in Figure 2A and Although transcription of individual pir genes tended not to be stage-specific, there were pirs clearly enriched in certain stages of the life-cycle (Supplementary Information 4).
These included the L1 pirs PBANKA_0317181 in male gametocytes (compared to mixed asexual stages: log fold change 5.09, FDR < 5x10 -9 ; compared to liver merozoites: log fold change 10.65, FDR < 5x10 -7 ) and PBANKA_0600031 in liver merozoites (compared to mixed asexual stages: log fold change 5.69, FDR < 5x10 -7 ; compared to liver merozoites: log fold change 6.82, FDR < 5x10 -6 ). The pir gene PBANKA_0400500 was highly transcribed in two of the three experiments performed for the later liver stages (post.48h). This gene is not restricted to liver-stage expression, although notably high in the liver stages, suggesting that it may play a role in late exo-erythrocytic and early asexual stage parasites.

Differential timing of expression of L and S pir gene subfamilies during the blood cycle of P. chabaudi
We have shown that pir gene expression levels in P. berghei are low in mosquito stages, but high in the mouse from late liver to asexual blood stages and male gametocytes. We wanted to explore the expression of rodent pir genes at higher resolution across the asexual blood stages to determine whether there were more subtle differences in expression of the different subfamilies. To do this, we generated transcriptional data from another rodent malaria parasite, P. chabaudi chabaudi AS ( Supplementary Information 5). This parasite has a largely overlapping repertoire of pir genes. We have a good understanding of pir gene expression at 14h post infection and are able to transmit it by mosquito. We have shown that compared to serial blood passage this results in expression of a wide repertoire of pir genes which we expect is more representative of the situation in the wild [20,23].  Figure 4B).
Pir genes show a cyclical expression pattern across the developmental cycle, with the majority upregulated at 8-14h (trophozoite stage) ( Figure 3B; Supplementary Information   6). This signal suggests that pir genes may be required in the schizonts and merozoites, as the transcriptional signal is expected to precede translation of protein by several hours (22). Pirs upregulated at ring stages were predominantly of the L1 subfamily, which have a slightly earlier transcriptional peak (8-11h) in the developing trophozoites and are quiescent only in the early ring stages. Transcription of both the S7 and S1 subfamilies peaks sharply at 14h. L4s notably exhibit two distinct temporal pir transcription profiles, with a proportion being similar to L1s and the remainder being like the short S pirs where ( Figure 3B). The ancestral pir gene, PCHAS_0101200, is transcribed throughout the intraerythrocytic developmental cycle, but still peaks during the trophozoite stages

The putative ancestral pir gene is a distinctive target for functional studies
We have shown above that the P. berghei ancestral pir orthologue, PBANKA_0100500, is the single most highly transcribed pir gene in this species. It contributes a high proportion of the total pir steady-state transcriptome in late liver, asexual blood and gametocyte stages.
Transcription is only diminished in stages where overall transcription of this multigene family is low i.e. liver stages before 36h of development, ookinetes and sporozoite stages.
Similarly, in P. chabaudi AS, the ancestral pir orthologue (PCHAS_0101200) is highly transcribed across the complete asexual developmental cycle (Supplementary Figure 5).
Extending earlier findings, we found that there is a single, syntenic ortholog in all species whose genomes contain canonical pirs [10,20]( Figure 4A). The sequences are highly conserved between species, with multiple blocks of 90-100% similarity ( Figure 4B). The gene is within the top 50% of expressed genes in all species for which we could find RNAseq data ( Figure 5).

Discussion
We found that high numbers of pir gene transcripts were expressed from a variety of family members in stages which produce the invasive blood forms (merozoites) and in male gametocytes. Conversely, only low levels were expressed throughout the mosquito stages, early vertebrate stages and in female gametocytes. This strongly indicates that this multigene family is not required for development within the mosquito beyond sexual reproduction. Instead, it is likely that the bulk of this large family, at least in rodent malaria parasites, is involved in blood stages, leading from red cell invasion by first generation (liver) merozoites to fertilisation and generation of the ookinete form rapidly after mosquito blood feeding. We have previously shown the high levels of pir gene expression in P.
chabaudi parasites in asexual blood stages [22,23] and the specificity of pir gene expression in male P. berghei gametocytes versus female gametocytes [13]. This new analysis looks broadly and deeply, confirming the findings based on much lower coverage Malaria Cell Atlas data [14].
We found that L pir genes are highly expressed in late liver stages, merozoites and schizonts.
These would require some time before being translated into functional proteins and it may be that they function in rings, e.g. soon after red cell invasion. Expression of S-type pirs is more steady throughout the development cycle. L-type pirs are first transcribed in mature liver schizonts, and upregulated again in asexual blood schizonts, so they are potentially involved in merozoite formation/function. When we look in more detail at this part of the lifecycle in P. chabaudi we see dynamic expression of each individual pir gene, as has been shown for genes across the Plasmodium genome [68]. Again, L and S types behave differently, suggesting differential regulation and function. Underlying this pattern is the relatively early and broad expression of the ChAPL loci, which are rich in L1 and a subset of L4 pirs, so they may be required throughout the cycle. The S-rich AAPLs peak later and more sharply, and may only be needed for one stage of the asexual cycle. We previously showed the ChAPLs to be associated with chronic infections. We think that populations of parasites expressing ChAPLs survive the acute immune response, while those expressing AAPL loci are killed [20]. Whether this relates to a function in sequestration, evasion of adaptive immunity or a completely different function is unclear. Molecular mimicry as a means of immune suppression has been described for P. falciparum RIFINs, another sequence-variable multigene family, some of which mimic the natural ligand for LILRB1 [69] and suppress the activity of NK cells. P. knowlesi pir sequences have also been shown to have a striking resemblance to 50% of CD99, a T-cell regulating protein, suggesting molecular mimicry and potential immune-modulating activity [8]. Such interactions could promote chronicity, although the mechanisms of the contribution of AAPLs and ChAPLs in infection outcome have yet to be elucidated.
The other focus of pir gene expression is in male gametocytes, with a relative absence of expression in females. A small number of L1 pir genes are highly expressed at this stage in P. berghei, but S1s and S4s predominate. Could this represent a continuation of function from asexual to sexual parasites? Male gametocytes are found in the bloodstream, whereas immature female gametocytes reside in bone marrow, primarily in the extravascular space.
Here, cellular rigidity may be more important than the receptor-mediated interactions of sequestration within the vasculature [70][71][72].
Our analyses indicate that expression of multiple pirs occurs in the parasite developmental stages which are predominantly circulating in the blood, where the parasite-infected cells are targeted by the host's immune system. PIR proteins have been demonstrated to be targeted by antibodies [6] and several studies have shown localisation at or near the parasite surface [16,19,73]. Recent structural studies together with sequence analysis and modelling have revealed hydrophobic conserved disulphide bonds forming cysteine residues suggesting that part of the PIR protein may be extracellular [34]. It is hypothesised that hydrophobic domains are exposed on the surface via a flexible linker. The diversity in sequence and length of the flexible loops are consistent with a surface location, and thus could interact directly with the host immune system or be involved in sequestration.
PIRs localise to different cellular compartments depending on the particular stage of asexual development e.g RBC cytoplasm and parasitophorous vacuole [19], [74]. This raises the possibility that PIR proteins are likely to have functions in addition to immune evasion or immune-suppression in the mammalian blood stream. Our transcriptional analyses do not shed light on this. However, we see an association between higher pir transcription and the stages involved in or just preceding proliferative steps of development, such as liver schizonts, asexual blood stages (specifically the trophozoites in P. c. chabaudi) and male gametocytes, which could be indicative of a role in phase transition rather than overall heightened transcription in these stages.
Although the other major proliferation stage of the parasite, the oocyst in the mosquito, is not covered in the P. berghei bulk RNAseq, single cell RNAseq data suggest that pir transcription is low in more mature oocysts. However, the earlier oocysts may be high transcribers. Bulk RNAseq analysis of oocysts from rodent Plasmodium species at different times post-ookinete differentiation would provide crucial data to investigate this link.
The high level of transcription of one, conserved pir gene, across most Plasmodium parasites suggests that it may serve an important function in the parasite, again in the blood cycle.
Transcription is upregulated just prior to entry into the blood stream and is maintained throughout the erythrocytic developmental stages. The P. cynomolgi RNAseq data demonstrated that transcription of its ortholog was much lower than for most of the transcriptome, however this data is from liver stages in which pir transcription may be low.
Unlike most of the other pir family members vector transmission has little impact on the transcription levels of this gene [75] [36]. Although its distinctiveness suggests that it may fulfil a different role from other pirs, this gene may prove a more tractable target for future studies. We believe that it will provide insights into the molecular function of the whole family.

Conclusion
The pir gene family is the largest found in malaria parasites, with potentially important roles in virulence and chronic infection. We have characterised the landscape of pir gene expression across the Plasmodium berghei lifecycle, highlighting the blood stages as the focus of activity. In depth analysis of the blood stages using the close relative P. chabaudi highlighted subtle differences in the timing of expression of different pir gene subfamilies.
However, the most distinctive expression pattern we found was for the putative ancestral pir gene, conserved across much of the Plasmodium genus, and very widely and highly expressed. The distinctiveness of this pir gene may make discovering its function more tractable while still shedding light on those genes already considered to be involved in hostparasite interactions.

Consent for publication
All authors have seen and approved the content.

Availability of data and materials
The RNA-seq data relating to P.     Table 1.     [85], and six clusters are highlighted in boxes. Six clusters were chosen as most optimal with the 'elbow method' [86], as implemented through the factoextra package function 'fviz_nbclust' using k-means clustering and within cluster sums of squares. The Experiment codes are listed in Table 1 (PBANKA_0514900) and nek4 (PBANKA_0616700), markers for female gametocytes and ookinetes; mapk2 (PBANKA_0933700), marker for male gametocytes [66]; hap2 (PBANKA_1212600), gamete fusion protein [67]. Bar height corresponds to median TPM, with error bars showing the range of TPM values across replicates. Colours correspond to different experiments.

Supplementary Figure 3
Representative Giemsa stained smears of P chabaudi infected iRBCs throughout the 24 hr developmental cycle. Scale bar indicates 10 micrometres.

Supplementary Figure 4
A: Transcriptional deconvolution of each sample from every time point using scRNAseq data [14]. B: PCA plot of the genome TPM data coloured by the different timepoints. PCA was conducted using the prcomp function in R.

Supplementary Figure 5
Bar chart of the transcription of the ancestral pir gene (PCHAS_0101200) across the P. c. chabaudi AS asexual blood cycle. Each point represents one replicate and bars show the median TPM.

Supplementary Table 1.
The relationship between P. chabaudi RNA-seq samples used in this study and their entries in the ENA is described. Figure 1 Systematic study of P. berghei RNAseq datasets that covers its entire life cycle and demonstrates the dynamic expression of pir genes throughout. Schematic of the published experiments used in this study and the P. berghei speci c developmental stages during which samples were collected. The experiment codes (E01 etc.) correspond to the references in Table 1.    Phylogenetic analysis of the ancestral pir gene. Cladogram of the 15 orthologs, with heatmap of expression of the gene relative to the rest of the genome, where data is available (from multiple transcriptomic datasets as RPKM or TPM -see