Tuning protein expression and yield in yeast using genetic
elements and fusion tags
High-yield protein expression often employs rigorous expression
conditions to achieve the highest yields possible for a target
construct. Under these conditions, the rate of protein synthesis can exceed
the capacity of the intracellular protein quality control machinery,
leading to misfolding and premature protein degradation. In the former
case, misfolded protein accumulates inside the expression host as
aggregated material called inclusion bodies (IBs). IBs are insoluble and
resistant to proteolytic degradation. This can be advantageous if a target IMP is
susceptible to premature degradation during expression or if
overexpression leads to premature cell death, which is common in the
case of MPs (Kesidis et al., 2020). Proteins can be coaxed into IBs by
incorporating certain fusion tags into the gene expression construct
(Esposito & Chatterjee, 2006). Fusion tags are widely used
in high-yield protein expression, primarily as handles for
post-production purification. These include the common hexahistidine tag
(His6), Myc, thioredoxin (Trx), and maltose binding protein (MBP).
Including either solubility or IB-directing tags within the gene
expression cassette also has substantial effects on protein yields
(Costa et al., 2014; Ki & Pack, 2020). Most reports of IB formation
come from E. coli, where it is generally considered an undesired
outcome of protein aggregation that results from poor expression
conditions and the lack of PTMs. Where E. coli falls short in
expression efforts, yeast has proved to be more successful (Cai et al.,
2019; Duman-Özdamar & Binay, 2021). IB formation can nevertheless be
exploited to increase expression yields and is useful when
a feasible refolding procedure has been established (Bhatwa et al.,
2021). The use of IB tags for S. cerevisiae expression is not as
well described, but there is evidence of IB formation under specific
conditions depending on the expression target (Binder et al., 1991;
Rueda et al., 2016). Interestingly, two prionogenic proteins that
originate in yeast, Sup35 and Ure2, were shown to form inclusion bodies when
expressed in E. coli (Espargaró et al., 2012). Repurposing these
proteins as fusion tags may provide one possible strategy to initiate
IB formation for proteins expressed in E. coli and S. cerevisiae.
Directing expressed proteins, primarily MPs, into IBs increases yield
but does little to preserve protein structure and function. Solubilizing
and refolding aggregated protein in the absence of chaperones becomes more
complicated as protein size (molecular weight) increases, because the
number of folding intermediates that can be populated also increases
(Kiefhaber et al., 1991; Mitraki & King, 1989; Silow &
Oliveberg, 1997). This is further complicated by any native PTMs
required for proper structure and function (Roth et al., 2010;
Shental-Bechor & Levy, 2008). In S. cerevisiae, the
secretory pathway is accompanied by the unfolded protein response
(UPR), which is triggered under stress conditions (Gardner et al.,
2013; Ng et al., 2000; Walter & Ron, 2011). High-level expression
conditions can create an environment where protein misfolding is more
likely to occur. The resulting cascade of intracellular events can
either increase the expression of molecular folding chaperones or
lead to premature cell death through activation of apoptotic pathways (Hetz
et al., 2020; Walter & Ron, 2011). Finding alternative ways to relieve
cellular stress is necessary to avoid these unwanted outcomes and
achieve maximum yields. Toward this end, induced expression
conditions must be attenuated, which can be accomplished by choosing a
compatible expression plasmid and tailoring its genetic
elements to give better control over the process.
Protein expression plasmids are small circular pieces of DNA
that are introduced into either bacterial or yeast hosts and
contribute to the phenotype of the host expression system. The
native transcription and translation machinery of the host is hijacked
to produce the product of the target genes included within the plasmid
DNA, usually at levels higher than the homeostatic levels of the native
genes in the host genome. When successful conditions have been
established, this results in high protein yields. Predicting optimal
expression conditions ab initio is nearly impossible considering
the intricate interplay of multiple processes involved. Rigorous
expression conditions are usually employed, but this can lead to
undesired outcomes including premature cell death. The use of genetic
elements such as promoters to induce protein expression and impose more
control on the process is one way to potentially mitigate this effect.
The choice of plasmid also influences protein expression levels through its
copy number within a cell and its mitotic stability, which determines how
reliably it is passed on through cell division during propagation. It has been
suggested that when a plasmid carries a strong constitutive promoter, its
copy number and stability are both reduced, which is enough to
offset the expression “gains” achieved by the strong promoter (Stueber &
Bujard, 1982). However, the use of auxotrophic selection markers has a
much greater effect on plasmid copy number (Karim et al., 2013). A
variety of commonly used plasmids are readily available for protein
expression in yeast. These include pRS plasmids and their variants,
pYES, pESC, and many others, which are summarized by Da Silva and
Srikrishnan (Da Silva & Srikrishnan, 2012; Fang et al., 2011). Many of
the plasmids in these series derive from the popular pRS plasmids and
differ by slight variations in their promoter regions as well as other
features including various selection markers and incorporated tags for
assaying expression and intracellular trafficking or for purification.
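The trade-off described above, where a strong promoter can be offset by lower copy number and stability, can be illustrated with a deliberately simple toy model. The sketch below is not taken from any of the cited studies. It assumes that relative output scales with promoter strength times copy number times the fraction of cells that still carry the plasmid, and that plasmid loss occurs with a constant probability per division. All function names and numerical values are hypothetical and chosen only for illustration.

```python
def retained_fraction(loss_per_division: float, generations: int) -> float:
    """Fraction of cells still carrying the plasmid after a number of divisions.

    Assumes a constant loss probability per division and equal growth rates
    for plasmid-bearing and plasmid-free cells (a simplification).
    """
    if not 0.0 <= loss_per_division <= 1.0:
        raise ValueError("loss_per_division must be between 0 and 1")
    return (1.0 - loss_per_division) ** generations


def relative_output(promoter_strength: float, copy_number: float,
                    loss_per_division: float, generations: int) -> float:
    """Toy estimate of culture-level expression (arbitrary units)."""
    return (promoter_strength * copy_number
            * retained_fraction(loss_per_division, generations))


# Hypothetical values: the strong promoter is three times stronger, but the
# plasmid carrying it has a lower copy number and is lost more often.
moderate = relative_output(promoter_strength=1.0, copy_number=20,
                           loss_per_division=0.01, generations=20)
strong = relative_output(promoter_strength=3.0, copy_number=8,
                         loss_per_division=0.03, generations=20)

print(f"moderate promoter: {moderate:.1f} a.u.")
print(f"strong promoter:   {strong:.1f} a.u.")
```

With these made-up inputs the stronger promoter gives the lower culture-level output, which is the qualitative point of the paragraph above. Real systems would need measured copy numbers and loss rates.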
Promoters can be classified as either constitutive or inducible. While
constitutive promoters such as TEF1, ADH1, and GPD offer a greater dynamic
range of expression, inducible promoters such as FIG1 and GAL enable greater
control over timing and transcriptional regulation of expression, which
is especially useful when expressing targets that are inherently toxic
to the host. MPs fall into this category. When MPs are overexpressed
and trafficked to the plasma membrane, the membrane can become
overcrowded, with destabilizing effects (Guigas & Weiss, 2016; Löwe
et al., 2020; H.-X. Zhou, 2009). Waiting to induce expression until
higher cell densities are achieved is a way to ensure expression will
not trigger premature cell death before higher yields can be obtained.
Efforts continue to be directed toward engineering new synthetic
promoters to optimize expression by incorporating elements that allow
for greater transcriptional control (Alper et al., 2005). Synthetic
promoters usually include hybrid features of the native promoter along
with a tunable site upstream of the core promoter that can act to
suppress its activity. This is particularly effective for a strong
promoter such as GAL1 (Mazumder & McMillen, 2014). Including repressor
sites can also convert a constitutive promoter, such as PFY1, into an inducible one
by a similar approach, incorporating regulatory elements upstream
of the core promoter. Alternatively, hybrid promoters take elements from
at least two different promoters, combining them into a single new
promoter with altered activity and regulation (Blazeck et al., 2012).
Recombining promoter regions requires a detailed understanding of
different native yeast promoters and regulatory elements. Some
promoters, the tightly regulated HO promoter for example, are
only active in certain cell types and respond to transcription
factors under specific conditions. The HO promoter is
activated only in mother cells in response to the SWI5
transcription factor during the G1 phase of the cell cycle (Nasmyth et
al., 1990). This adds an additional layer of protein expression control
through the use of transcription factors, which interact with promoter
and repressor regions to up- or down-regulate expression. This further
expands the capacity for fine-tuning. Transcription factors are native
to all organisms, and their identification has traditionally relied on
experimental characterization (Ian A. Taylor et al., 2000; Vachon et
al., 2013). These methods have largely been supplanted by genome
sequencing and mapping methods, combined with bioinformatics, which help
identify conserved regulatory regions across species (Hahn & Young,
2011; Yu, 2006). Understanding transcriptional activation or repression of
a specific promoter region requires a priori knowledge of its
native function, which genome mapping has helped elucidate. These
efforts have resulted in the curation of libraries of promoters enabling
high-throughput screening and the construction of synthetic promoters
(Gordân et al., 2011). Beyond that, functional assessment is needed to
assay transcription factor and regulatory motif compatibility with the
host organism expression machinery. This can be accomplished using a
fluorescence reporter gene to measure expression levels or by measuring
mRNA (Blazeck et al., 2012). Using a repressor in conjunction with a
promoter provides tighter control over the induction of expression
and could help to avoid the negative effects that result from
overwhelming the intracellular expression machinery under rigorous
expression conditions, including when a strong promoter is used.
Coupled with an in-depth analysis of expression for a target construct,
a linked repressor can be activated at different time points throughout
the period of expression to avoid aggravating the UPR or inducing
apoptosis (Kaneko & Nomura, 2003).
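The reporter-based functional assessment mentioned above usually comes down to comparing a normalized fluorescence signal between a test promoter and a reference promoter. The following is a minimal sketch of that arithmetic, assuming fluorescence is background-subtracted and divided by culture density (OD600) before replicates are averaged. The function names, the blank value, and all readings are hypothetical and are not data from the cited work.

```python
from statistics import mean


def normalized_signal(fluorescence: float, od600: float,
                      blank_fluorescence: float = 0.0) -> float:
    """Background-subtracted reporter fluorescence per unit of culture density."""
    if od600 <= 0:
        raise ValueError("od600 must be positive")
    return (fluorescence - blank_fluorescence) / od600


def fold_change(test_replicates, reference_replicates) -> float:
    """Ratio of mean normalized signals, test promoter over reference promoter."""
    return mean(test_replicates) / mean(reference_replicates)


# Hypothetical plate-reader readings as (fluorescence, OD600) pairs.
BLANK = 50.0
reference_reads = [(1050.0, 1.0), (1250.0, 1.2)]
test_reads = [(2050.0, 1.0), (2450.0, 1.2)]

reference = [normalized_signal(f, od, BLANK) for f, od in reference_reads]
test = [normalized_signal(f, od, BLANK) for f, od in test_reads]

print(f"fold change vs. reference promoter: {fold_change(test, reference):.2f}")
```

Normalizing to OD600 is one common choice. Normalizing to cell counts, or comparing mRNA levels instead, would follow the same pattern with a different denominator or input.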
Another strategy for gaining better control over protein expression is
to engineer non-native terminator sequences. Although not as
commonly addressed as a means of tailoring protein expression, adjusting
terminator sequences can affect the completion of transcription, the
dissociation and recycling of the transcription machinery, and other
important parameters such as mRNA half-life. Terminators are versatile and can be
introduced across different yeast expression platforms, such as the industrial
yeast Y. lipolytica, and they can be constructed entirely synthetically,
sharing few to no features with any native terminator
sequence (Curran et al., 2015). Short terminator sequences
engineered for yeast increased protein expression levels
3.7-fold compared with the commonly used CYC1 terminator.
A weak terminator can lead to transcriptional read-through and
delayed dissociation of RNA polymerase. In E. coli, it has been
suggested that lower transcription efficiency can slow expression,
reducing the chance of overwhelming the translation machinery (Swartz,
2001). By similar reasoning, lowering transcription efficiency may be
necessary to attenuate the effects of a strong promoter and avoid triggering
stress-related outcomes such as misfolding, degradation (of mRNA and protein), and
cell death. One other approach, used in P. pastoris (syn. K. phaffii),
employs hybrid promoters that contain both
bacterial and yeast-derived elements (Liu et al., n.d.). The purpose is
to employ a yeast species that could produce proteins with desired PTMs
rather than the hyperglycosylation S. cerevisiae is known for,
while adapting it to high-level expression by introducing promoter
features from a related high-producing species. Using a synthetic hybrid
promoter along with a transactivator enabled methanol-free activation of
protein production and increased the yield of a recombinant α-amylase
expression target produced in P. pastoris (syn. K. phaffii). This
avoided the need to use toxic methanol to induce
expression.