3.4 | Microproteins with predicted structures
With the advent of three-dimensional macromolecular structure prediction
tools such as Rosetta, iTasser, Phyre, and, most recently, AlphaFold,
many recently discovered, now-annotated microproteins have been
subjected to computational structure prediction, and these structural
models are publicly available. For microproteins that remain
unannotated, computational tools can be used to generate testable
structural predictions. For example, analysis of the recently identified E. coli cold-shock microprotein YmcF using iTasser led to the
hypothesis that YmcF may adopt a folded structure consisting of an alpha
helix and 2-3 beta strands separated by a turn, homologous to a
zinc-binding domain of aspartate transcarbamoylase (Figure 4G). While no
functional data for YmcF yet exist, this predicted structural model, if
correct, may have implications for the cold shock response, which
requires RNA-binding proteins (some of which coordinate zinc) to
chaperone RNA secondary structures that become hyper-stable at low
temperature. In another example, plant microProteins are specifically
defined as proteins predicted to fold into single domains that bind to
and generally antagonize the functions of their effectors, such as
transcription factors.
Predicted structures of microproteins have already begun to aid in
determining their molecular and cellular functions. A translated
upstream ORF (uORF) encoding a 96-amino-acid microprotein within the 5′
untranslated region (UTR) of the human ASNSD1 gene was reported
by Oyama et al. in 2007 and in subsequent proteomic analyses, leading to
the annotation of the microprotein as ASDURF (ASNSD1 upstream open
reading frame). As discussed above, evidence is accumulating that uORF
microproteins can function in trans. Remarkably, Coulombe and
colleagues recently implicated ASDURF as the “missing” subunit of a
chaperone complex termed the PAQosome. Proximity biotinylation and
pull-down experiments with PAQosome subunits revealed ASDURF as an
interaction partner, and in vitro reconstitution assays suggested that
it is an integral member of a PAQosome subcomplex. The PAQosome is a
recently discovered chaperone that is essential for assembling
complicated macromolecular complexes in the cell, including RNA
polymerases, components of the spliceosome, and protein phosphatases.
The PAQosome consists of two modules, one of which is termed the
prefoldin-like (PFDL) module. The PFDL module shares some subunits with, and
has putative structural homology to, prefoldin, another cellular chaperone
required for folding cytoskeletal proteins and other clients. Prefoldin
and the PFDL module are both hexameric, consisting of three alpha- and
three beta-prefoldin subunits, each of which contains an alpha-helical
coiled coil interrupted by either one (beta) or two (alpha) hairpins;
however, only five of the six PFDL subunits (three alpha and two beta)
had been identified. Tertiary structure modeling with Phyre suggested
that ASDURF is a beta-prefoldin bearing a single beta hairpin and
coiled coil (Figure 4H), consistent with its potential identification as
the undiscovered beta subunit of the PFDL module of the PAQosome.
This suggests that it had been missed because it was not part of the proteome
annotation at the time of the PAQosome’s discovery. Many additional
interesting questions are raised by the ASDURF microprotein: Why is it
encoded in an upstream ORF within the ASNSD1 gene? Does its 5′
UTR location confer stress responsiveness via translational regulation,
as suggested by Cloutier et al.? Is its function or regulation related
to the downstream ASNSD1 protein, per the model of Chen, Weissman and
colleagues that co-encoded microproteins and proteins tend to function
in the same pathways? Regardless, while the structural model requires
experimental validation, ASDURF appears to be a particularly
compelling example of a microprotein for which structure prediction
has informed the understanding of its interactions and likely function.