PEST sequences in the malaria parasite Plasmodium falciparum: a genomic study
Malaria Journal volume 2, Article number: 16 (2003)
Inhibitors of the protease calpain are known to have selectively toxic effects on Plasmodium falciparum. The enzyme has a natural inhibitor calpastatin and in eukaryotes is responsible for turnover of proteins containing short sequences enriched in certain amino acids (PEST sequences). The genome of P. falciparum was searched for this protease, its natural inhibitor and putative substrates.
The publicly available P. falciparum genome was found to have too many errors to permit reliable analysis. An earlier annotation of chromosome 2 was instead examined. PEST scores were determined for all annotated proteins. The published genome was searched for calpain and calpastatin homologs.
Typical PEST sequences were found in 13% of the proteins on chromosome 2, including a surprising number of cell-surface proteins. The annotated calpain gene has a non-biological "intron" that appears to have been created to avoid an unrecognized frameshift. Only the catalytic domain has significant similarity with the vertebrate calpains. No calpastatin homologs were found in the published annotation.
A calpain gene is present in the genome and many putative substrates of this enzyme have been found. Calpastatin homologs may be found once the re-annotation is completed. Given the selective toxicity of calpain inhibitors, this enzyme may be worth exploring further as a potential drug target.
Calpain (EC 3.4.22.17) is a Ca2+-dependent cysteine protease first isolated in 1978, with a pH optimum between 7.0 and 8.0. There are at least 15 distinct calpain genes in the human genome and several have a number of isoforms (up to 10). Along with the ATP-dependent proteasome, calpain appears to be responsible for the majority of non-lysosomal targeted proteolysis. It is a member of the papain superfamily, a group of proteases that includes papain, calpain, streptopain, ubiquitin-specific peptidases and many families of viral cysteine endopeptidases. Calpain is a protein of ancient origin with homologues found in vertebrates, insects, crustaceans, nematodes, fungi, higher plants, Dictyostelium, kinetoplastid Protozoa, and bacteria, and evolved from a gene fusion event between an N-terminal cysteine protease and a C-terminal calmodulin-like protein, an event predating the eukaryote/prokaryote divergence.
The enzyme cleaves preferentially on the C-terminal side of tyrosine, methionine or arginine, preceded by leucine or valine (i.e. P1 = Y, M, or R; P2 = L or V according to the established nomenclature). Calpain occurs either as a heterodimer with a small regulatory subunit and a large catalytic subunit or as the catalytic subunit alone. It has been crystallised and its structure has been solved for several species [6, 7]. The active site consists of a conserved triad of cysteine, asparagine and histidine. The catalytic domain is divided into two subdomains (2a and 2b) with the cysteine residue lying in domain 2a and the histidine and asparagine in 2b. Calpain has a natural monomeric protein inhibitor, calpastatin. In the presence of Ca2+, calpain undergoes a conformational change, dissociates from or cleaves the associated calpastatin and finally cleaves its own first domain to become fully active.
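The P2/P1 preference described above can be illustrated with a simple motif scan. The following is a minimal sketch only: the function name is hypothetical, and a two-residue pattern match is far too permissive to serve as a cleavage predictor, since substrate recognition appears to depend mainly on PEST sequences rather than on the local motif.

```python
import re

# P2 = L or V, followed by P1 = Y, M or R; cleavage would occur
# on the C-terminal side of the P1 residue.
P2_P1_MOTIF = re.compile(r"[LV][YMR]")


def putative_cleavage_sites(seq):
    """Return 1-based positions of P1 residues matching the P2/P1 preference.

    Hypothetical helper for illustration; it reports motif matches only.
    """
    # m.start() is the 0-based index of P2, so P1 is at m.start() + 1
    # (0-based), i.e. m.start() + 2 in 1-based numbering.
    return [m.start() + 2 for m in P2_P1_MOTIF.finditer(seq.upper())]
```

For example, `putative_cleavage_sites("AALYGG")` reports position 4, the tyrosine that follows the leucine.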
Substrates of this enzyme appear to be recognised principally by the presence of PEST sequence(s) within the protein [9, 10], although exceptions are known. PEST sequences were first described in 1986 and are short subsequences (usually 10 – 60 residues) within proteins that are bounded by but do not contain basic residues (H, K or R), and are enriched in proline (P), glutamate (E), serine (S), threonine (T) and aspartate (D) residues. An algorithm (the PEST-find score) has been described for assessing the significance of such subsequences: a score of 5 or greater is regarded as significant. PEST sequences are found in ~10% of all cellular proteins in the organisms analysed to date and are typically found in highly regulated proteins. PEST +ve (PEST sequence containing) proteins typically have short half lives (0.5 to 2 hours) in intact cells compared with most other proteins (>24 hours). In PEST +ve proteins, removal or disruption of the PEST sequence increases the protein's half life to more "normal" values while insertion or creation of a new PEST sequence within a PEST -ve (PEST sequence free) protein decreases that protein's half life to a value typical of a PEST +ve protein.
Two papers describe the effects of calpain inhibitors on P. falciparum. The first described the effect of calpain inhibitors on the invasion of erythrocytes. The authors found the inhibitors used were ~100 times more potent (IC50 ~10^-7 M) than the other protease inhibitors (chymostatin, leupeptin, pepstatin A and bestatin) examined. Erythrocytes normally contain only calpain 2 and it was not clear at the time whether the effect of these inhibitors resulted from inhibition of the parasite's calpain, the erythrocyte's calpain, or both. This has been clarified recently by Hanspal et al., who reinvestigated this effect in calpain 2 knock-out mice. The mouse erythrocytes were shown to have no detectable calpain activity but still supported the invasion and growth of P. falciparum in culture. Calpain inhibition again prevented re-invasion. A third paper has shown that removal of Ca2+ from the growth medium results in growth arrest in the late trophozoite stage and failure to invade erythrocytes – findings consistent with a role for calpain in the parasite life cycle. With the recent publication of the entire genome [16–18] a search for the calpain and calpastatin genes or their homologs and PEST +ve proteins was undertaken to investigate this further.
The flat files (version 2) were downloaded from the PlasmoDB web site http://www.plasmodb.org/. The gene coordinates were extracted and used to build a database of the genes. Multiple errors were found in the annotation, including anomalous start codons (AAA, CAC, GTA, TAG) and termination codons (AAT, ATA, AAG, CTT, GAA, GGG, TTC, TTT), introns of length 2 and 5 bases, unusual intronic splice sites (TA, TT), introns with exceptionally high GC content (up to 44%) and the absence of a number of experimentally known genes. In view of these findings, the alternative annotation of chromosome 2, as revised in September 2002 http://www.wehi.edu.au/MalDB-www/chr2list.html/, was used.
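The kinds of annotation anomaly listed above can be screened for with a few simple checks on each gene model. The sketch below is illustrative and is not the software used in this study: the function names, the minimum intron length (20 bases) and the intron GC ceiling (30%) are assumptions chosen for the example, and only the canonical ATG start codon, the TAA/TAG/TGA termination codons and GT...AG intron boundaries are treated as normal.

```python
START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc_percent(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def check_gene_model(cds, introns, min_intron_len=20, max_intron_gc=30.0):
    """Return a list of human-readable problems found in one gene model.

    cds     -- spliced coding sequence, start codon to stop codon inclusive
    introns -- list of intron sequences, each given 5' to 3'
    The two thresholds are hypothetical defaults, not published values.
    """
    problems = []
    cds = cds.upper()
    if cds[:3] != START_CODON:
        problems.append(f"anomalous start codon {cds[:3]}")
    if cds[-3:] not in STOP_CODONS:
        problems.append(f"anomalous termination codon {cds[-3:]}")
    if len(cds) % 3 != 0:
        problems.append("coding length is not a multiple of 3")
    for number, intron in enumerate(introns, start=1):
        intron = intron.upper()
        if len(intron) < min_intron_len:
            problems.append(f"intron {number}: only {len(intron)} bases long")
        if not (intron.startswith("GT") and intron.endswith("AG")):
            problems.append(f"intron {number}: non-canonical splice sites")
        if gc_percent(intron) > max_intron_gc:
            problems.append(f"intron {number}: GC content {gc_percent(intron):.0f}%")
    return problems
```

A model that passes returns an empty list, so the checks can be run over every annotated gene and the non-empty results collected for manual inspection.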
The PEST score was calculated with the standard algorithm. Sequences of 10 or more residues bounded by, but not containing, basic residues (H, K or R) are first identified. The mole percent (MP) of P, E, D, S and T in this subsequence is then determined after subtracting one mole equivalent of P, E/D and S/T. The normalised hydrophobicity value of a residue is its Kyte-Doolittle index multiplied by 10 plus 45, giving values between 0 and 90. Stellwagen http://emb1.bcc.univie.ac.at/embnet/tools/bio/PESTfind/ has suggested that a value of 58 for tyrosine, rather than 32 as originally given, gives a more reliable PEST score: the former value was used here. The average hydrophobicity (Ho) of a subsequence is determined by summing, over all residues, the product of each residue's mole percent and its normalised hydrophobicity value. The PEST-find score is 0.55 (MP) - 0.5 (Ho). (The original paper has a misprint, giving PEST-find = 0.5 (MP) - 0.5 (Ho).)
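The scoring procedure described above can be sketched in a few lines of Python. This is an illustrative reconstruction under stated assumptions, not the software written for this study. The assumptions are: the ends of the protein are treated as region boundaries; a candidate region must contain at least one P, one D or E, and one S or T; the "one mole equivalent" correction is applied by subtracting three residues from the P/E/D/S/T count while dividing by the full region length; and Ho is taken as the mole-fraction-weighted mean, so that it stays on the 0 to 90 scale. All function names are hypothetical, and only the 20 standard residues are handled.

```python
import re

# Kyte-Doolittle hydropathy values (Kyte & Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
# Normalised hydrophobicity: index * 10 + 45, giving 0 (R) to 90 (I).
HYDRO = {aa: round(10 * kd + 45) for aa, kd in KYTE_DOOLITTLE.items()}
HYDRO["Y"] = 58  # Stellwagen's suggested value instead of 32


def candidate_regions(seq, min_len=10):
    """Yield (0-based start, region) for stretches free of H, K and R."""
    for match in re.finditer(r"[^HKR]+", seq.upper()):
        region = match.group()
        if (len(region) >= min_len
                and "P" in region
                and ("D" in region or "E" in region)
                and ("S" in region or "T" in region)):
            yield match.start(), region


def pest_score(region):
    """PEST-find score = 0.55 * MP - 0.5 * Ho for one candidate region."""
    n = len(region)
    pedst = sum(region.count(aa) for aa in "PEDST")
    mp = 100.0 * (pedst - 3) / n              # assumption: see lead-in
    ho = sum(HYDRO[aa] for aa in region) / n  # mole-fraction-weighted mean
    return 0.55 * mp - 0.5 * ho


def find_pest(seq, threshold=5.0):
    """Return (start, region, score) for regions scoring >= threshold."""
    hits = []
    for start, region in candidate_regions(seq):
        score = pest_score(region)
        if score >= threshold:
            hits.append((start, region, score))
    return hits
```

Under these assumptions, `find_pest("K" + "PEST" * 3 + "R" + "L" * 12 + "K")` reports only the twelve-residue PEST repeat, because the leucine stretch contains none of the required residues and is never scored.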
The revised annotation of chromosome 2 predicts 206 protein-encoding genes. Forty-four PEST sequences with scores > 5 were found with lengths varying from 12 to 94 (30.3 ± 19.9) amino acids in 27 (13.1%) proteins. The proteins fall into four groups (a) hypothetical proteins – 4 (b) DNA binding proteins – 2 (origin recognition complex subunit 5 and chromatin-binding protein) (c) metabolic proteins – 3 (a phosphatase, ribosome releasing factor, ATP-dependent acyl-CoA synthetase) and (d) cell surface associated proteins – 18 (erythrocyte membrane proteins (EMP) 1 and 3, rifins, merozoite surface proteins (MSP) 2 and 5, serine-repeat antigens, transmission blocking target antigen pfs230 and two predicted secreted proteins). The PEST sequences occur throughout proteins with some bias towards the N-terminal end (60% are found in the first half). The PEST +ve proteins had significantly lower predicted pIs (6.63 and 8.13 respectively: t = 3.86, p < 0.0005) and were significantly longer (1181 and 731 amino acids respectively: t = 2.91, p < 0.004) than the average. Eighteen (66.7%) of the PEST +ve proteins have introns, a figure slightly higher than the mean (57.5%). There was no significant difference in the number of introns per protein (1.12 and 1.32, t = -0.16, p > 0.8).
On chromosome 13, a putative calpain gene (MAL13P1.310) containing a single intron was found. This gene has been discussed in a paper by Wu et al. The gene is unusually large (2047 residues) and has a biologically implausible "intron" that appears to have been created to avoid an unrecognised frameshift [see Additional file: 1]. The 5' end of the gene contains a low-complexity region and the gene is more than twice the size of other known calpains (generally 600 – 800 residues). The catalytic domain is the only part of the enzyme with homology to the vertebrate enzymes, and in the P. falciparum gene this domain is unusually distant from the N-terminus (residues 1002–1470): in other organisms the active site lies within 150 residues of the N-terminus. It seems probable that a fusion event has occurred at the original 5' end of the calpain gene with a second, as yet unidentified, gene. If activation of the P. falciparum calpain is similar to that in other organisms, the 5' domain would be removed during activation and this new element may be responsible for the selective toxicity of the inhibitors or may play some regulatory role.
The errors in the published genome were unexpected. A re-annotation of the genome will shortly be completed, after which it should be possible to compare the two annotations (Huestis R., personal communication).
PEST sequences have not been previously described in P. falciparum and the sequences found here appear to be very similar to those known in other organisms. The presence of PEST sequences in hypothetical proteins, DNA-binding proteins and proteins involved in intermediary metabolism was expected, while the finding that the majority of PEST +ve proteins were surface-exposed was not. The greater length and the lower predicted pI of the PEST +ve proteins and the locational bias of the PEST sequences towards the N-terminus are consistent with earlier findings.
Several families of surface exposed proteins are present in chromosome 2: PEST sequences were found in all Pf EMPs, MSP 2 and 5, a subset of SERA antigens (6 of 8) and rifins (2 of 7). Cytoadherence-linked asexual genes (CLAGs) and sub-telomeric variable open-reading frames (STEVORs) were all PEST -ve. The presence of PEST sequences in a subset of SERA and rifins is suggestive of differential processing or cellular turnover and may shed some light on the reason for the large number of these genes in the genome.
The presence of PEST sequences in surface exposed proteins prompted a search for these sequences in other proteins known to be involved in merozoite invasion. MSP-1 and -2, the SERAs, and erythrocyte binding antigen (EBA)-175 are PEST +ve while apical membrane antigen (AMA)-1, rhoptry associated protein (RAP)-1, -2 and -3, MAEBL, merozoite capping protein-1 and acidic/basic repeat antigen (ABRA) are PEST -ve. Spectrin, band 4.1 and ankyrin are PEST +ve erythrocyte proteins known to be bound by P. falciparum merozoites during the invasion process. Band 3, another PEST +ve erythrocyte protein, is eliminated from the site of contact with the merozoite. The involvement of calpain in other cell fusion reactions and the presence of PEST +ve membrane proteins suggest that this system may be involved in cell adhesion processes in P. falciparum.
As the parasite progresses from the trophozoite to the schizont stage, there is a thirty-fold increase in the level of transcription of calpain. The erythrocyte normally maintains a submicromolar intracellular Ca2+ concentration, which rises thirty-fold as the parasite matures. Given the presence of a PEST sequence in the DNA origin recognition complex protein and the effects of Ca2+ removal on trophozoite-to-schizont progression, it is tempting to speculate that the lack of Ca2+ may inhibit calpain activation and that this is responsible for the effects seen here [28–32]. Bearing in mind a report of a central role for falcipain 1 in merozoite biology, the increase in transcriptional levels seen in the schizont and the effect of inhibitors on erythrocyte invasion suggest that calpain too may play a role here.
The P. falciparum calpain gene differs significantly from those found to date in vertebrates and this may partly explain the selective toxicity of the inhibitors. There are many calpain inhibitors presently available and the majority of these are small peptides that can be freeze-dried and stored at room temperature. Several have been used in Phase 2 human trials for treatment of myocardial infarction, stroke and cancer. Given the need for novel drugs to treat malaria, the selective toxicity of these inhibitors, several known crystal structures and the possibility of recognising probable target proteins and gene sequence, these agents may be worth exploring further.
Ishiura S, Murofushi H, Suzuki K, Imahori K: Studies of a calcium-activated neutral protease from chicken skeletal muscle. I. Purification and characterization. J Biochem. 1978, 84: 225-230.
Maki M, Kitaura Y, Satoh H, Ohkouchi S, Shibata H: Structures, functions and molecular evolution of the penta-EF-hand Ca2+-binding proteins. Biochim Biophys Acta. 2002, 1600: 51-60. 10.1016/S1570-9639(02)00444-2.
Hata S, Nishi K, Kawamoto T, Lee HJ, Kawahara H, Maeda T, Shintani Y, Sorimachi H, Suzuki K: Both the conserved and the unique gene structure of stomach-specific calpains reveal processes of calpain gene evolution. J Mol Evol. 2001, 53: 191-203. 10.1007/s002390010209.
Schechter I, Berger A: On the size of the active site in proteases. I. Papain. Biochem Biophys Res Commun. 1967, 27: 157-162.
Schad E, Farkas A, Jekely G, Tompa P, Friedrich P: A novel human small subunit of calpains. Biochem J. 2002, 362: 383-388. 10.1042/0264-6021:3620383.
Strobl S, Fernandez-Catalan C, Braun M, Huber R, Masumoto H, Nakagawa K, Irie A, Sorimachi H, Bourenkow G, Bartunik H, Suzuki K, Bode W: The crystal structure of calcium-free human m-calpain suggests an electrostatic switch mechanism for activation by calcium. Proc Natl Acad Sci U S A. 2000, 97: 588-592. 10.1073/pnas.97.2.588.
Hosfield CM, Elce JS, Davies PL, Jia Z: Crystal structure of calpain reveals the structural basis for Ca(2+)-dependent protease activity and a novel mode of enzyme activation. EMBO Journal. 1999, 18: 6880-6889. 10.1093/emboj/18.24.6880.
Todd B, Moore D, Deivanayagam CC, Lin G, Chattopadhyay D, Maki M, Wang KK, Narayana SV: A Structural Model for the Inhibition of Calpain by Calpastatin: Crystal Structures of the Native Domain VI of Calpain and its Complexes with Calpastatin Peptide and a Small Molecule Inhibitor. J Mol Biol. 2003, 328: 131-146. 10.1016/S0022-2836(03)00274-2.
Wang N, Chen W, Linsel-Nitschke P, Martinez LO, Agerholm-Larsen B, Silver DL, Tall AR: A PEST sequence in ABCA1 regulates degradation by calpain protease and stabilization of ABCA1 by apoA-I. J Clin Invest. 2003, 111: 99-107. 10.1172/JCI200316808.
Shumway SD, Maki M, Miyamoto S: The PEST domain of IkappaBalpha is necessary and sufficient for in vitro degradation by mu-calpain. J Biol Chem. 1999, 274: 30874-30881. 10.1074/jbc.274.43.30874.
Rechsteiner M, Rogers SW: PEST sequences and regulation by proteolysis. Trends Biochem Sci. 1996, 21: 267-271. 10.1016/0968-0004(96)10031-1.
Rogers S, Wells R, Rechsteiner M: Amino acid sequences common to rapidly degraded proteins: the PEST hypothesis. Science. 1986, 234: 364-368.
Olaya P, Wasserman M: Effect of calpain inhibitors on the invasion of human erythrocytes by the parasite Plasmodium falciparum. Biochim Biophys Acta. 1991, 1096: 217-221. 10.1016/0925-4439(91)90008-W.
Hanspal M, Goel VK, Oh SS, Chishti AH: Erythrocyte calpain is dispensable for malaria parasite invasion and growth. Mol Biochem Parasitol. 2002, 122: 227-229. 10.1016/S0166-6851(02)00104-4.
Wasserman M, Alarcon C, Mendoza PM: Effects of Ca++ depletion on the asexual cell cycle of Plasmodium falciparum. Am J Trop Med Hyg. 1982, 31: 711-717.
Gardner MJ, Shallom SJ, Carlton JM, Salzberg SL, Nene V, Shoaibi A, Ciecko A, Lynn J, Rizzo M, Weaver B, Jarrahi B, Brenner M, Parvizi B, Tallon L, Moazzez A, Granger D, Fujii C, Hansen C, Pederson J, Feldblyum T, Peterson J, Suh B, Angiuoli S, Pertea M, Allen J, Selengut J, White O, Cummings LM, Smith HO, Adams MD, Venter JC, Carucci DJ, Hoffman SL, Fraser CM: Sequence of Plasmodium falciparum chromosomes 2, 10, 11 and 14. Nature. 2002, 419: 531-534. 10.1038/nature01094.
Hall N, Pain A, Berriman M, Churcher C, Harris B, Harris D, Mungall K, Bowman S, Atkin R, Baker S, Barron A, Brooks K, Buckee CO, Burrows C, Cherevach I, Chillingworth C, Chillingworth T, Christodoulou Z, Clark L, Clark R, Corton C, Cronin A, Davies R, Davis P, Dear P, Dearden F, Doggett J, Feltwell T, Goble A, Goodhead I, Gwilliam R, Hamlin N, Hance Z, Harper D, Hauser H, Hornsby T, Holroyd S, Horrocks P, Humphray S, Jagels K, James KD, Johnson D, Kerhornou A, Knights A, Konfortov B, Kyes S, Larke N, Lawson D, Lennard N, Line A, Maddison M, McLean J, Mooney P, Moule S, Murphy L, Oliver K, Ormond D, Price C, Quail MA, Rabbinowitsch E, Rajandream MA, Rutter S, Rutherford KM, Sanders M, Simmonds M, Seeger K, Sharp S, Smith R, Squares R, Squares S, Stevens K, Taylor K, Tivey A, Unwin L, Whitehead S, Woodward J, Sulston JE, Craig A, Newbold C, Barrell BG: Sequence of Plasmodium falciparum chromosomes 1, 3–9 and 13. Nature. 2002, 419: 527-531. 10.1038/nature01095.
Hyman RW, Fung E, Conway A, Kurdi O, Mao J, Miranda M, Nakao B, Rowley D, Tamaki T, Wang F, Davis RW: Sequence of Plasmodium falciparum chromosome 12. Nature. 2002, 419: 534-537. 10.1038/nature01102.
Kissinger JC, Brunk BP, Crabtree J, Fraunholz MJ, Gajria B, Milgram AJ, Pearson DS, Schug J, Bahl A, Diskin SJ, Ginsburg H, Grant GR, Gupta D, Labo P, Li L, Mailman MD, Mcweeney SK, Whetzel P, Stoeckert CJ, Roos DS: The Plasmodium genome database – Designing and mining a eukaryotic genomics resource. Nature. 2002, 419: 490-492. 10.1038/419490a.
Huestis R, Fischer K: Prediction of many new exons and introns in Plasmodium falciparum chromosome 2. Mol Biochem Parasitol. 2001, 118: 187-199. 10.1016/S0166-6851(01)00376-0.
Kyte J, Doolittle RF: A simple method for displaying the hydropathic character of a protein. J Mol Biol. 1982, 157: 105-132.
Wu Y, Wang X, Liu X, Wang Y: Data-mining approaches reveal hidden families of proteases in the genome of malaria parasite. Genome Res. 2003, 13: 601-616. 10.1101/gr.913403.
Chevaillier P: Pest sequences in nuclear proteins. Int J Biochem. 1993, 25: 479-482. 10.1016/0020-711X(93)90653-V.
Rojas FJ, Moretti-Rojas I: Involvement of the calcium-specific protease, calpain, in the fertilizing capacity of human spermatozoa. Int J Androl. 2000, 23: 163-168. 10.1046/j.1365-2605.2000.00221.x.
Foley M, Tilley L, Sawyer WH, Anders RF: The ring-infected erythrocyte surface antigen of Plasmodium falciparum associates with spectrin in the erythrocyte membrane. Mol Biochem Parasitol. 1991, 46: 137-147. 10.1016/0166-6851(91)90207-M.
Dluzewski AR, Fryer PR, Griffiths S, Wilson RJ, Gratzer WB: Red cell membrane protein distribution during malarial invasion. J Cell Sci. 1989, 92: 691-699.
Krogstad DJ, Sutera SP, Marvel JS, Gluzman IY, Boylan CW, Colca JR, Williamson JR, Schlesinger PH: Calcium and the malaria parasite: parasite maturation and the loss of red cell deformability. Blood Cells. 1991, 17: 229-241.
McCallum-Deighton N, Holder AA: The role of calcium in the invasion of human erythrocytes by Plasmodium falciparum. Mol Biochem Parasitol. 1992, 50: 317-323. 10.1016/0166-6851(92)90229-D.
Wasserman M: The role of calcium ions in the invasion of Plasmodium falciparum. Blood Cells. 1990, 16: 450-451.
Wasserman M, Chaparro J: Intraerythrocytic calcium chelators inhibit the invasion of Plasmodium falciparum. Parasitol Res. 1996, 82: 102-107. 10.1007/s004360050078.
Tanabe K, Izumo A, Kato M, Miki A, Doi S: Stage-dependent inhibition of Plasmodium falciparum by potent Ca2+ and calmodulin modulators. J Protozool. 1989, 36: 139-143.
Gazarini ML, Thomas AP, Pozzan T, Garcia CR: Calcium signaling in a low calcium environment: how the intracellular malaria parasite solves the problem. J Cell Biol. 2003, 161: 103-110. 10.1083/jcb.200212130.
Greenbaum DC, Baruch A, Grainger M, Bozdech Z, Medzihradszky KF, Engel J, DeRisi J, Holder AA, Bogyo M: A role for the protease falcipain 1 in host cell invasion by the human malaria parasite. Science. 2002, 298: 2002-2006. 10.1126/science.1077426.
We would like to thank Robert Huestis for his help with the calpain gene annotation and two anonymous referees for their suggestions on an earlier version of this manuscript.
DM conceived of the study, downloaded the data used, wrote the software, analysed the results and wrote the drafts of the manuscript. AB supervised the process, corrected and edited the earlier drafts.
Cite this article
Mitchell, D., Bell, A. PEST sequences in the malaria parasite Plasmodium falciparum: a genomic study. Malar J 2, 16 (2003). https://doi.org/10.1186/1475-2875-2-16
- Merozoite Surface Protein
- Calpain Inhibitor
- Origin Recognition Complex
- Apical Membrane Antigen
- Selective Toxicity