Conclusion

The Attention U-Net convolutional architecture of ProteinUnet was shown to predict SS8 much better when using features from protein language models than when using features from MSA (as in the previous ProteinUnet2 network). Our experiments suggest that it achieves prediction quality at least as good as SPOT-1D-LM while being much faster, and that it is as fast as NetSurfP-3.0, which achieves much worse results. This supports our hypothesis that the state of the art in SS8 prediction can be reached without LSTM networks. Additionally, ProteinUnetLM outperforms the ProtT5Sec classifier, which suggests that our architecture provides a significant improvement over this simple fully convolutional network. Our focus on the class imbalance in SS8 prediction, addressed by adjusting the loss function of the network (a minimal illustration is sketched below), allowed ProteinUnetLM to achieve state-of-the-art results in the AGM metric, providing results competitive with AlphaFold2 and dominating all other networks on the CASP14 dataset. ProteinUnetLM can be considered one of the most efficient (prediction time shorter than 200 ms per sequence) and effective (macro-AGM 0.653–0.829, SOV8 0.648–0.786, Q8 0.651–0.771, depending on the test set) networks for predicting rare secondary structures, such as the 3₁₀-helix (G), beta-bridge (B), and high-curvature loop (S), while maintaining high performance for the other structures. It can be run even on computers without a GPU, making it well suited to embedded chips, mobile devices, and low-end computers. To support the reproducibility of the research and to encourage the community to adopt our network, we share the models, the complete code (for both training and inference), and an easy-to-use web interface.
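As an illustration of the imbalance-aware training mentioned above, the sketch below shows one common way to counteract class imbalance: weighting a categorical loss by inverse class frequency. This is a minimal, hypothetical example written in PyTorch; the class counts are invented placeholders, and the actual loss used by ProteinUnetLM is described in the Methods section and may differ in detail.

```python
import torch
import torch.nn as nn

# Hypothetical per-class counts for the 8 DSSP states (order assumed:
# H, E, C, T, G, S, B, I); real counts must be computed from the training set.
class_counts = torch.tensor([8.5e6, 5.2e6, 4.8e6, 2.9e6, 9.6e5, 2.1e6, 2.4e5, 8.0e3])

# Inverse-frequency weights, normalized so that they average to 1.
weights = class_counts.sum() / (len(class_counts) * class_counts)

# Weighted cross-entropy: rare classes such as beta-bridge (B) and
# 3_10-helix (G) contribute more to the gradient than the frequent H state.
criterion = nn.CrossEntropyLoss(weight=weights)

# logits: (batch, 8, sequence_length), labels: (batch, sequence_length)
logits = torch.randn(2, 8, 704)
labels = torch.randint(0, 8, (2, 704))
loss = criterion(logits, labels)
```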
The only limitation of Attention U-Net in comparison to LSTMs is the limited size of the input sequence (704 residues). However, longer sequences rarely occur in nature, and they can still be predicted by Attention U-Net in fragments if necessary, as sketched below. Moreover, an additional performance boost can be achieved by training an ensemble of 10 ProteinUnetLM models in the way described in the ProteinUnet2 publication [31]. The architecture can be easily extended to predict torsion angles and protein features such as half-sphere exposure, accessible solvent area, or contact number, as presented in the first ProteinUnet publication [36]. We plan to extend ProteinUnetLM with these outputs to enhance the utility of our network. We also plan to train a larger version of ProteinUnetLM on much larger datasets to reach AlphaFold2 accuracy. We hope that the thesis stated in the title will inspire researchers to apply the Attention U-Net architecture to replace less energy-efficient LSTMs in other domains beyond bioinformatics, such as the classification of electroencephalography signals [57].
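For completeness, the fragment-based inference mentioned above can be realized with a simple sliding window: sequences longer than the 704-residue input are split into overlapping chunks, predicted independently, and the overlapping per-residue probabilities are averaged. The overlap size is an illustrative assumption, and `predict_fragment` is a hypothetical stand-in for a call to the trained model.

```python
import numpy as np

MAX_LEN = 704   # input window of the network
OVERLAP = 64    # illustrative overlap between consecutive fragments

def predict_long_sequence(sequence, predict_fragment):
    """Predict per-residue SS8 probabilities for a sequence of any length.

    `predict_fragment` is assumed to map a string of length <= MAX_LEN
    to an array of shape (len(fragment), 8).
    """
    n = len(sequence)
    if n <= MAX_LEN:
        return predict_fragment(sequence)

    probs = np.zeros((n, 8))
    counts = np.zeros((n, 1))
    step = MAX_LEN - OVERLAP
    starts = list(range(0, n - MAX_LEN + 1, step))
    if starts[-1] + MAX_LEN < n:      # ensure the tail is covered
        starts.append(n - MAX_LEN)
    for start in starts:
        end = start + MAX_LEN
        probs[start:end] += predict_fragment(sequence[start:end])
        counts[start:end] += 1
    return probs / counts             # average overlapping predictions
```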