The expression of three different DBL domains and one CIDR domain as recombinant proteins in E. coli was induced either at an OD A600 of 0.6, which is commonly recommended, or at an OD A600 higher than 2.0. SDS-PAGE analysis (Figure 1) shows that most of the recombinant protein from cultures induced at a low OD A600 (Figure 1, lanes 1, 3, 5, 7) was truncated at the C-terminal end and appeared as multiple bands of different molecular weights, while the intact protein represented only a small fraction of the overall protein yield. In contrast, when expression was induced at a higher OD A600 (Figure 1, lanes 2, 4, 6, 8), the dominant fraction of the protein was the intact form. This held for all four domains tested, although they were derived from different PfEMP1s.
Bacterial cultures grow in log phase between an OD A600 of 0.3 and 1.5. During the log phase, the number of bacteria in the culture doubles approximately every 20 minutes. Afterwards, the proliferation rate slows down due to the lack of nutrients. If induction is initiated while the bacteria are growing in log phase, the bacterial translation machinery is highly active and expression of the recombinant protein follows this profile, because once turned on, the promoter controlling the heterologous sequence on the vector is not subject to further control mechanisms. During expression, the rare codons for arginine, leucine, isoleucine and proline that are frequently found in PfEMP1 sequences will inhibit translation, most likely through exhaustion of the tRNAs for these amino acids. It has been reported that the rare codons for arginine and proline are likely to cause frameshifts and, with that, undesired products in bacterial expression systems [21–23]. The data reported here indicate that these problems mainly occur during the high-level expression stage, since proteins expressed at the post-log growth stage are much less truncated.
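The log-phase figures quoted above (doubling roughly every 20 minutes between an OD A600 of 0.3 and 1.5) can be turned into a rough timing estimate. The following is a minimal sketch, not part of the experimental protocol: it assumes ideal exponential growth and that OD A600 is proportional to cell number, and the function names and the default doubling time are illustrative choices. Because growth slows once the culture leaves log phase, the model should not be used to predict when a culture will pass an OD A600 of 2.0.

```python
import math

# Assumption: ideal exponential growth with a fixed doubling time.
# The text gives ~20 min as an approximate value for log phase only.
DOUBLING_MIN = 20.0


def od_after(od_start, minutes, doubling_min=DOUBLING_MIN):
    """Return the OD A600 expected after `minutes` of exponential growth."""
    if od_start <= 0 or doubling_min <= 0:
        raise ValueError("od_start and doubling_min must be positive")
    return od_start * 2.0 ** (minutes / doubling_min)


def minutes_to_reach(od_start, od_target, doubling_min=DOUBLING_MIN):
    """Return the minutes needed to grow from od_start to od_target."""
    if od_start <= 0 or od_target <= 0 or doubling_min <= 0:
        raise ValueError("all arguments must be positive")
    return doubling_min * math.log2(od_target / od_start)


if __name__ == "__main__":
    # Time spent in log phase, from OD A600 0.3 up to 1.5.
    print(round(minutes_to_reach(0.3, 1.5), 1))
```

Under these assumptions, a culture would cross the whole log-phase window in about 46 minutes, which illustrates how short the period of maximal translational activity is compared with the time needed to reach the post-log densities used for induction here.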
Enzymatic digestion of heterologous proteins in E. coli is thought to be an additional reason for product heterogeneity of recombinant proteins [24]. The experiments of this study could not confirm degradation by bacterial proteases as one of the major causes, since the use of a protease inhibitor cocktail in the purification protocol did not affect the pattern of the expressed products (data not shown). In addition, expression was carried out using a BL21 Codon Plus bacterial strain (Stratagene) that is deficient in the OmpT and Lon bacterial proteases.
We have previously found that a large proportion of the recombinant protein remains in the insoluble fraction, whereas only small amounts appear in the soluble fraction (Figure 2 and data not shown), if expression is initiated at an OD A600 of 0.6. To check whether the bacterial growth status at the time of induction has any effect on protein solubility, expression was induced in aliquots of the same bacterial stock culture at different bacterial densities (OD A600 values). Soluble and insoluble fractions of the same culture were compared. The results (Figure 2A–C) clearly show that the majority of each of the three recombinant proteins remained in the insoluble fraction when expression was induced at an OD A600 below 2.0. If, on the other hand, induction was initiated at an OD A600 greater than 2.0, almost all of the recombinant protein appeared in the soluble fraction.
The solubility of a protein correlates with its correct structure, which is formed during a post-translational folding process. Freshly synthesized polypeptides remain in an intermediate state in the bacterial cytoplasm. After several enzymatic and biochemical processing steps, the peptides are folded into their functional form [24]. However, if proteins are folded incorrectly, they tend to accumulate as aggregates in the bacterial cell and, in order to avoid toxic effects on the host system, the bacteria store these aggregates in confined structures referred to as inclusion bodies.
Formation and accumulation of heterologous proteins as inclusion bodies is a common problem in protein expression. The exact mechanism of this process is still not understood. It has been suggested that factors such as culture pH, temperature and protein amino acid composition might affect the solubility of a recombinant protein [24]. The data reported here indicate that the speed of expression and, with that, the subsequent folding process are the most important factors. Protein expression at the post-log phase resulted in high amounts of soluble protein, which suggests that the low bacterial growth rate at this stage keeps biosynthesis at a low speed. Slow synthesis allows the protein processing machinery to efficiently assemble the freshly synthesized peptides into the correct structure. Correctly folded proteins are most likely to stay in the soluble form, provided that the molecule does not contain large numbers of hydrophobic residues.
Although we found that the pH value of the growing culture is influenced by the amino acid composition of the expressed polypeptide, keeping a stable pH value in the bacterial culture does not affect protein solubility (data not shown) and therefore has little influence on the quality of the expressed protein. Temperature, however, is an important factor to consider. Keeping the culture at 16°C before and after induction slightly improves protein quality (data not shown) but slows down bacterial growth considerably and therefore reduces the final yield of the recombinant protein.
It has been reported that codon optimization of sequences for use in E. coli will improve expression quality. Here we show that the C/G versus A/T content of the heterologous gene sequence is not among the most important factors that determine the quality of the recombinant protein. The expression of GST-DBL1α of FCR3S1.3 (Figure 2A), optimized for expression in E. coli, shows a very similar expression pattern to those of GST-DBL1α and GST-DBL2β of TM284S2, which were expressed using the wild-type P. falciparum sequences (Figure 2B,C). This indicates that sequence composition is not always a determining factor for expression quality.
We have previously found that the DBL1α domain of FCR3S1.2var1 PfEMP1 binds to the human erythrocyte surface through heparan sulfate [20, 25]. Further, the recombinant GST-DBL1α protein of FCR3S1.2 can be purified through binding to heparin-sepharose. In this study, equal amounts of GST-DBL1α purified from cultures in which expression was started at an OD A600 of 0.6 or at an OD A600 greater than 2.0 were tested for their ability to bind to heparin. Although the truncated forms of DBL1α bind to heparin due to the presence of heparin-binding motifs in these peptides, there is a remarkable difference in binding affinity between the proteins expressed at different bacterial densities, as shown in Figure 3. Proteins expressed at a high OD A600 are not only more intact and more soluble, but also display higher affinity for heparin.
To further demonstrate the functionality of the proteins expressed at a high OD A600, the DBL1α of FCR3S1.2 was subjected to a blood group A binding assay, which confirmed the specific interaction between DBL1α and the blood group A antigen (data not shown).
The expression of P. falciparum-derived proteins, especially membrane-bound proteins, is still a great challenge due to the high content of amino acids encoded by rare codons in the P. falciparum genome. The method reported here presents an easily applicable tool to express sequences containing rare codons. The key factor for the expression of such proteins is to decelerate the translation machinery inside the bacteria. A low expression speed will not only allow the ribosome to pass smoothly along the mRNA template and synthesize full-length polypeptide chains, but also enable the proteins to move slowly from the unstable intermediate state to the correctly folded state. The described expression approach will result in a final product that is soluble, intact and functional. Nevertheless, additional factors might influence expression and need to be optimized for each individual construct.
The expression of eukaryotic genes in E. coli is one of the most frequently used tools in modern science. Numerous approaches have aimed at achieving the highest possible level of expression by maximizing the amount of protein expressed per bacterial cell. Our studies suggest, on the contrary, that increasing the number of bacterial cells in the culture while keeping the expression process at a low level might considerably improve both the quality and the quantity of the protein. In this way, high-level expression can be achieved simply by increasing the bacterial density of a culture, and problems in the form of truncated or insoluble protein fractions are almost completely eliminated.