Tuning protein expression and yield in yeast using genetic elements and fusion tags
High-yield protein expression often employs rigorous expression conditions to achieve the highest possible yield for a target construct. Under these conditions, the rate of protein synthesis can exceed the capacity of the intracellular protein quality control machinery, leading to misfolding and premature protein degradation. Misfolding leads to the accumulation of aggregated material inside the expression host in the form of inclusion bodies (IBs), which are insoluble and resistant to proteolytic degradation. This can be advantageous if a target IMP is susceptible to premature degradation during expression or if overexpression leads to premature cell death, which is common in the case of MPs (Kesidis et al., 2020). Proteins can be coaxed into IBs by incorporating certain fusion tags into the gene expression construct (Esposito & Chatterjee, 2006). Fusion tags are widely used in high-yield protein expression, primarily as a handle for post-production purification. These include the common hexahistidine tag (His6), Myc, thioredoxin (Trx), and maltose binding protein (MBP). Including either solubility or IB-directing tags within the gene expression cassette also has substantial effects on protein yields (Costa et al., 2014; Ki & Pack, 2020). Most reports of IB formation are in E. coli, where it is generally considered an undesired outcome of protein aggregation that results from poor expression conditions and the lack of PTMs. Where E. coli falls short in expression efforts, yeast has proved to be more successful (Cai et al., 2019; Duman-Özdamar & Binay, 2021). Taking advantage of IB formation can effectively increase expression yields and is useful when a feasible refolding procedure has been established (Bhatwa et al., 2021). The use of IB tags for S. cerevisiae expression is not as well described, but there is evidence of IB formation under specific conditions depending on the expression target (Binder et al., 1991; Rueda et al., 2016). Interestingly, two prionogenic proteins that originated in yeast were shown to form inclusion bodies when expressed in E. coli (Espargaró et al., 2012). Re-purposing these proteins, Sup35 and Ure2, as fusion tags may provide one possible strategy to initiate IB formation for proteins expressed in E. coli and S. cerevisiae.
Directing expressed proteins, primarily MPs, into IBs increases yield but does little to preserve protein structure and function. Solubilizing and refolding aggregated protein in the absence of chaperones becomes more complicated as protein size (molecular weight) increases, owing to the growing number of folding intermediates that can be populated (Kiefhaber et al., 1991; Mitraki & King, 1989; Silow & Oliveberg, 1997). This is further complicated by any native PTMs required for proper structure and function (Roth et al., 2010; Shental-Bechor & Levy, 2008). In S. cerevisiae, the presence of the secretory pathway is accompanied by the unfolded protein response (UPR), which is triggered under stress-induced conditions (Gardner et al., 2013; Ng et al., 2000; Walter & Ron, 2011). High-level expression conditions can create an environment where protein misfolding is more likely to occur. The resulting cascade of intracellular events can either increase the expression of molecular folding chaperones or lead to premature cell death through activation of apoptotic pathways (Hetz et al., 2020; Walter & Ron, 2011). Finding alternate ways to relieve expression-induced cellular stress is necessary to avoid these unwanted outcomes and achieve maximum yields. Toward this end, attenuating induced expression conditions is required, which can be accomplished by choosing a compatible expression plasmid and tailoring the incorporated genetic elements to gain better control over the process.
Protein expression plasmids are small circular pieces of DNA that are introduced into either bacterial or yeast hosts and alter the phenotype of the host expression system. The native transcription and translation machinery of the host is hijacked to produce the product of the target genes included within the plasmid DNA, usually at levels higher than the homeostatic levels of the native genes in the host genome. When successful conditions have been established, this results in high protein yields. Predicting optimal expression conditions ab initio is nearly impossible considering the intricate interplay of the multiple processes involved. Rigorous expression conditions are usually employed, but this can lead to undesired outcomes including premature cell death. The use of genetic elements such as promoters to induce protein expression and impose more control on the process is one way to potentially mitigate this effect. Choice of plasmid also influences protein expression levels through its copy number within a cell and its mitotic stability, which determines how reliably the plasmid is passed on through cell division during propagation. It has been suggested that when a plasmid is coupled with a strong constitutive promoter, both copy number and stability are reduced, which is enough to offset the expression “gains” achieved by the strong promoter (Stueber & Bujard, 1982). The use of auxotrophic selection markers, however, has a much greater effect on plasmid copy number (Karim et al., 2013). A variety of commonly used plasmids are readily available for protein expression in yeast. These include pRS plasmids and their variants, pYES, pESC, and many others, which have been summarized elsewhere (Da Silva & Srikrishnan, 2012; Fang et al., 2011).
Many of the plasmids in these series derive from the popular pRS plasmids and differ by slight variations in their promoter regions as well as in other features, including selection markers and incorporated tags for assaying expression and intracellular trafficking or for purification. Promoters can be classified as either constitutive or inducible. While constitutive promoters, such as TEF1, ADH1, and GPD, offer a greater dynamic range of expression, inducible promoters, such as FIG1 and GAL, enable greater control over the timing and transcriptional regulation of expression, which is especially useful when expressing targets that are inherently toxic to the host. MPs fall into this category. When they are overexpressed and trafficked to the plasma membrane, membrane overcrowding can result, with destabilizing effects (Guigas & Weiss, 2016; Löwe et al., 2020; H.-X. Zhou, 2009). Waiting to induce expression until higher cell densities are reached is one way to ensure that expression does not trigger premature cell death before higher yields can be obtained.
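The reasoning behind delayed induction can be illustrated with a deliberately simple simulation. The sketch below is a toy model, not a description of any cited study: it assumes logistic growth, a fixed growth penalty and death rate once expression is induced, and product formation proportional to the amount of induced biomass. The function name and all parameter values (mu, capacity, burden, death, q) are hypothetical placeholders chosen only to make the trade-off visible.

```python
def simulate_yield(induce_at_od, hours=48.0, dt=0.01, mu=0.4, capacity=20.0,
                   burden=0.8, death=0.05, q=1.0, od0=0.1):
    """Toy model of induction timing (all parameters are illustrative assumptions).

    Before induction the culture grows logistically at rate `mu` toward `capacity`.
    Once the culture density reaches `induce_at_od`, growth is reduced by the
    fraction `burden`, cells die at rate `death`, and product accumulates at
    `q` units per OD unit per hour. Returns the total product after `hours`.
    """
    steps = int(round(hours / dt))
    od, product, induced = od0, 0.0, False
    for _ in range(steps):
        if not induced and od >= induce_at_od:
            induced = True
        growth = mu * od * (1.0 - od / capacity)
        if induced:
            # Expression burden slows growth and adds a toxicity-driven death term.
            od += ((1.0 - burden) * growth - death * od) * dt
            product += q * od * dt
        else:
            od += growth * dt
    return product


early = simulate_yield(induce_at_od=0.1)   # induce at inoculation
late = simulate_yield(induce_at_od=10.0)   # induce after biomass has built up
print(f"induce at OD 0.1: {early:.1f} | induce at OD 10: {late:.1f}")
```

With these assumed parameters, inducing after biomass has accumulated gives a larger total product than inducing at inoculation, because the toxic expression phase starts from many more cells. Real cultures would require measured growth and toxicity parameters, and the optimal induction point depends on them.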
Efforts continue to be directed toward engineering new synthetic promoters that optimize expression by incorporating elements that allow for greater transcriptional control (Alper et al., 2005). Synthetic promoters usually include hybrid features of the native promoter along with a tunable site upstream of the core promoter that can act to suppress its activity. This is particularly effective for a strong promoter such as GAL1 (Mazumder & McMillen, 2014). Using a similar approach, incorporating repressor sites upstream of the core promoter can also convert the constitutive promoter PFY1 into an inducible one. Alternatively, hybrid promoters take elements from at least two different promoters and combine them into a single new promoter with altered activity and regulation (Blazeck et al., 2012). Recombining promoter regions requires a detailed understanding of different native yeast promoters and regulatory elements. Some promoters, such as the tightly regulated HO promoter, are only active in certain cell types and respond to transcription factor initiation under specific conditions. The HO promoter is activated only in mother-type yeast cells in response to the SWI5 transcription factor during the G1 phase of the cell cycle (Nasmyth et al., 1990). Transcription factors, which interact with promoter and repressor regions to up- or down-regulate expression, therefore add a further layer of control over protein expression and expand the capacity for fine tuning. Transcription factors are native to all organisms, and their identification has traditionally relied on experimental characterization (I. A. Taylor et al., 2000; Vachon et al., 2013). These methods have largely been supplanted by genome sequencing and mapping methods, along with bioinformatics, which help identify conserved regulatory regions across species (Hahn & Young, 2011; Yu, 2006).
Understanding transcriptional activation or repression at a specific promoter region requires a priori knowledge of its native function, which genome mapping has helped elucidate. These efforts have resulted in the curation of promoter libraries, enabling high-throughput screening and the construction of synthetic promoters (Gordân et al., 2011). Beyond that, functional assessment is needed to assay the compatibility of transcription factors and regulatory motifs with the expression machinery of the host organism. This can be accomplished by using a fluorescence reporter gene to measure expression levels or by measuring mRNA (Blazeck et al., 2012). Using a repressor in conjunction with a promoter provides tighter control over the induction of expression and could help to avoid the negative effects of overwhelming the intracellular expression machinery under rigorous expression conditions, including when a strong promoter is used. Coupled with an in-depth analysis of expression for a target construct, a linked repressor can be activated at different time points throughout the period of expression to avoid aggravating the UPR or inducing apoptosis (Kaneko & Nomura, 2003).
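A reporter-based comparison of promoters can be sketched in a few lines. The example below assumes a common plate-reader workflow that is not specified in the text: raw fluorescence is corrected with a blank reading, divided by OD600 to account for cell density, and expressed relative to a reference promoter. The function name, promoter labels, and numbers are hypothetical placeholders, not measurements.

```python
def relative_strengths(readings, blank_fluorescence, reference):
    """Rank promoters from reporter data (illustrative workflow, assumed here).

    `readings` maps a promoter label to a (raw fluorescence, OD600) pair.
    Fluorescence is blank-corrected, normalized to OD600, and then divided
    by the value obtained for the `reference` promoter.
    """
    per_od = {
        name: (fluorescence - blank_fluorescence) / od600
        for name, (fluorescence, od600) in readings.items()
    }
    reference_value = per_od[reference]
    return {name: value / reference_value for name, value in per_od.items()}


# Placeholder readings invented for illustration only.
readings = {
    "reference_promoter": (1100.0, 1.0),
    "candidate_A": (2100.0, 1.0),
    "candidate_B": (350.0, 0.5),
}
strengths = relative_strengths(readings, blank_fluorescence=100.0,
                               reference="reference_promoter")
for name, strength in sorted(strengths.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {strength:.2f}x reference")
```

Normalizing to a reference promoter measured in the same experiment makes values comparable across plates or days, which matters when screening a library rather than a handful of constructs.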
Another strategy for gaining better control over protein expression is to engineer non-native terminator sequences. Although not as commonly addressed as a means of tailoring protein expression, adjusting terminator sequences can affect the completion of transcription, the dissociation and recycling of the transcription machinery, and other important parameters such as mRNA half-life. Engineered terminators can be introduced across different yeast expression platforms, such as the industrial yeast Y. lipolytica, and they can be constructed entirely synthetically, resembling few to no features of any native terminator sequence (Curran et al., 2015). Short terminator sequences engineered for yeast increased protein expression levels by 3.7-fold compared with the commonly used CYC1 terminator. A weak terminator can lead to transcriptional read-through and delayed dissociation of RNA polymerase. In E. coli, it has been suggested that lower transcription efficiency can slow expression, reducing the chance of overwhelming the translation machinery (Swartz, 2001). By a similar rationale, lowering transcription efficiency may be necessary to attenuate the effects of a strong promoter and avoid triggering stress-related cellular responses such as misfolding, degradation (of mRNA and protein), and cell death. One other approach, which has been used in P. pastoris (syn. K. phaffii), employs hybrid promoters that contain both bacterial and yeast-derived elements (Liu et al., n.d.). The purpose is to employ a yeast species that can produce proteins with desired PTMs, rather than the hyperglycosylation S. cerevisiae is known for, while adapting it to high-level expression by introducing promoter features from a related high-producing species. Using a synthetic hybrid promoter along with a transactivator enabled methanol-free activation of protein production and increased the yield of a recombinant α-amylase expression target produced in P. pastoris (syn. K. phaffii). This avoided the need to use toxic methanol to induce expression.
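Returning to the terminator discussion above, the link between mRNA half-life and expression level can be made concrete with a standard first-order model. This is a textbook simplification, assumed here for illustration and not taken from the cited studies: transcripts are produced at a constant rate $k_{\mathrm{tx}}$ and decay with half-life $t_{1/2}$, where $m$ is the transcript abundance and $m_{\mathrm{ss}}$ its steady-state value.

```latex
\frac{dm}{dt} = k_{\mathrm{tx}} - \frac{\ln 2}{t_{1/2}}\, m
\quad\Longrightarrow\quad
m_{\mathrm{ss}} = \frac{k_{\mathrm{tx}}\, t_{1/2}}{\ln 2}
```

Under these assumptions, steady-state transcript abundance scales linearly with half-life, so a terminator that doubles mRNA half-life would double $m_{\mathrm{ss}}$ at a fixed transcription rate. Whether protein output follows proportionally depends on translation capacity and on the stress responses described earlier.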