Conclusion
The Attention U-Net convolutional architecture of ProteinUnet was shown
to predict SS8 substantially better when using features from protein
language models than when using MSA-based features (as in the previous
ProteinUnet2 network). Our experiments suggest that it achieves at least
as good prediction quality as SPOT-1D-LM while being much faster, and
that it matches the speed of NetSurfP-3.0 while achieving much better
results. This supports our hypothesis that state-of-the-art SS8
prediction can be achieved without LSTM networks. Additionally,
ProteinUnetLM outperforms the ProtT5Sec classifier, which suggests that
our architecture provides a significant improvement over this simple
fully convolutional network.
Our focus on the class imbalance inherent in SS8 prediction, addressed
by adjusting the loss function of the network, allowed ProteinUnetLM to
achieve state-of-the-art results in the AGM metric, providing results
competitive with AlphaFold2 and dominating all other networks on the
CASP14 dataset.
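As a minimal sketch of the general idea behind such imbalance-aware
training, the example below applies inverse-frequency class weighting to
a categorical cross-entropy; the class frequencies shown are
placeholders, and the actual loss used by ProteinUnetLM may differ:

import numpy as np

# Illustrative inverse-frequency weights for the 8 SS8 states
# (H, G, I, E, B, T, S, C). The frequencies below are placeholders,
# not the values reported in the paper; in practice the weights are
# often clipped or smoothed to keep very rare classes from dominating.
ss8_freq = np.array([0.33, 0.04, 0.0002, 0.21, 0.01, 0.11, 0.08, 0.20])
class_weights = 1.0 / ss8_freq
class_weights /= class_weights.sum()  # rescale; any positive scaling works

def weighted_cross_entropy(y_true, y_pred, weights=class_weights, eps=1e-8):
    """Class-weighted categorical cross-entropy over an (L, 8) prediction.

    y_true: one-hot labels, shape (L, 8); y_pred: softmax outputs, same
    shape. Rare classes (e.g. G, B, S) contribute more to the loss, so the
    network is penalized for ignoring them.
    """
    per_residue = -np.sum(weights * y_true * np.log(y_pred + eps), axis=-1)
    return per_residue.mean()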
ProteinUnetLM can be considered one of the most efficient (prediction
time below 200 ms per sequence) and effective (macro-AGM 0.653–0.829,
SOV8 0.648–0.786, Q8 0.651–0.771, depending on the test set) networks
for predicting rare secondary structures, such as the 3₁₀-helix (G),
beta-bridge (B), and high-curvature loop (S), while maintaining high
performance for the other structures. It can be run even on computers
without a GPU, making it an attractive solution for embedded chips,
mobile devices, and low-end computers. To support the reproducibility of
this research and to encourage the community to adopt our network, we
share the models, the complete code (for both training and inference),
and an easy-to-use web interface.
The only limitation of Attention U-Net in comparison to LSTMs is the
limited size of the input sequence (704 residues). However, longer
sequences rarely occur in nature, and if necessary they can still be
predicted by Attention U-Net in fragments.
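As an illustration, the sketch below splits an over-long sequence into
overlapping windows and averages the per-residue probabilities so that
fragment boundaries do not produce hard seams; predict_ss8, the window
length, and the overlap are assumptions of this example, not part of the
released ProteinUnetLM code:

import numpy as np

MAX_LEN = 704  # ProteinUnetLM input limit

def predict_long_sequence(seq, predict_ss8, overlap=50):
    """Predict SS8 for a sequence longer than MAX_LEN via windowing.

    `predict_ss8` is assumed to map a sequence of at most MAX_LEN residues
    to an (L, 8) array of class probabilities. Overlapping windows are
    averaged, so every residue gets a smooth consensus prediction.
    """
    L = len(seq)
    if L <= MAX_LEN:
        return predict_ss8(seq)
    probs = np.zeros((L, 8))
    counts = np.zeros((L, 1))
    step = MAX_LEN - overlap
    for start in range(0, L, step):
        end = min(start + MAX_LEN, L)
        probs[start:end] += predict_ss8(seq[start:end])
        counts[start:end] += 1
        if end == L:
            break
    return probs / counts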
Moreover, an additional performance boost can be achieved by training an
ensemble of 10 ProteinUnetLM models, as described in the ProteinUnet2
publication [31]. The architecture can also be easily extended to
predict torsion angles and protein features such as half-sphere
exposure, accessible solvent area, or contact number, as presented in
the first ProteinUnet publication [36].
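A minimal sketch of how such multi-task outputs could be attached as
parallel per-residue heads on a shared feature map is shown below; the
head names, sizes, and activations are assumptions of this example, not
the published ProteinUnet design:

import tensorflow as tf
from tensorflow.keras import Model, layers

def add_prediction_heads(features):
    """Attach illustrative per-residue output heads to shared features.

    `features` stands in for the final Attention U-Net feature tensor of
    shape (L, C); every head is a 1x1 convolution, so each residue gets
    its own prediction from the same learned representation.
    """
    ss8 = layers.Conv1D(8, 1, activation="softmax", name="ss8")(features)
    # Torsion angles as (sin, cos) pairs for phi and psi, bounded by tanh.
    torsion = layers.Conv1D(4, 1, activation="tanh", name="torsion")(features)
    # Per-residue scalar regression, e.g. relative accessible solvent area.
    asa = layers.Conv1D(1, 1, activation="sigmoid", name="asa")(features)
    return ss8, torsion, asa

# Example wiring with a dummy backbone output of 128 channels.
inputs = tf.keras.Input(shape=(704, 128))
model = Model(inputs, add_prediction_heads(inputs))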
We plan to extend ProteinUnetLM with those outputs to enhance the
utility of our network. We also plan to train a larger version of
ProteinUnetLM on much larger datasets to reach AlphaFold2 accuracy. We
hope that the thesis stated in the title will inspire researchers to
apply the Attention U-Net architecture to replace less energy-efficient
LSTMs in other domains beyond bioinformatics, such as the classification
of electroencephalography signals [57].