References

1. Kabsch W, Sander C. Dictionary of protein secondary structure: Pattern recognition of hydrogen‐bonded and geometrical features. Biopolymers. 1983;22:2577-2637.
2. Liu J, Rost B. Comparing function and structure between entire proteomes. Protein Science. 2001;10(10):1970-1979. doi:10.1110/ps.10101
3. McGuffin LJ, Bryson K, Jones DT. What are the baselines for protein fold recognition? Bioinformatics. 2001;17(1):63-72. doi:10.1093/bioinformatics/17.1.63
4. Meiler J, Baker D. Coupled prediction of protein secondary and tertiary structure. Proc Natl Acad Sci U S A. 2003;100(21):12105-12110. doi:10.1073/pnas.1831973100
5. Hung LH, Samudrala R. PROTINFO: secondary and tertiary protein structure prediction. Nucleic Acids Research. 2003;31(13):3296-3299. doi:10.1093/nar/gkg541
6. Myers JK, Oas TG. Preorganized secondary structure as an important determinant of fast protein folding. Nat Struct Mol Biol. 2001;8(6):552-558. doi:10.1038/88626
7. Gardy JL, Spencer C, Wang K, et al. PSORT-B: improving protein subcellular localization prediction for Gram-negative bacteria. Nucleic Acids Research. 2003;31(13):3613-3617. doi:10.1093/nar/gkg602
8. Van Domselaar GH, Stothard P, Shrivastava S, et al. BASys: a web server for automated bacterial genome annotation. Nucleic Acids Research. 2005;33(suppl_2):W455-W459. doi:10.1093/nar/gki593
9. Mewes HW, Frishman D, Mayer KFX, et al. MIPS: analysis and annotation of proteins from whole genomes in 2005. Nucleic Acids Research. 2006;34(suppl_1):D169-D172. doi:10.1093/nar/gkj148
10. Wishart DS, Case DA. Use of Chemical Shifts in Macromolecular Structure Determination. In: James TL, Dötsch V, Schmitz U, eds. Methods in Enzymology. Vol 338. Nuclear Magnetic Resonance of Biological Macromolecules Part A. Academic Press; 2002:3-34. doi:10.1016/S0076-6879(02)38214-4
11. Smolarczyk T, Roterman-Konieczna I, Stapor K. Protein Secondary Structure Prediction: A Review of Progress and Directions. Current Bioinformatics. 2020;15:90-107.
12. Chou PY, Fasman GD. Prediction of protein conformation. Biochemistry. 1974;13:222-245.
13. Garnier J, Osguthorpe DJ, Robson B. Analysis of the accuracy and implications of simple methods for predicting the secondary structure of globular proteins. Journal of Molecular Biology. 1978;120:97-120.
14. Lim VI. Algorithms for prediction of α-helical and β-structural regions in globular proteins. Journal of Molecular Biology. 1974;88:873-894.
15. Jiang Q, Jin X, Lee SJ, Yao S. Protein secondary structure prediction: A survey of the state of the art. Journal of Molecular Graphics and Modelling. 2017;76:379-402. doi:10.1016/j.jmgm.2017.07.015
16. Rost B, Sander C. Improved prediction of protein secondary structure by use of sequence profiles and neural networks. Proceedings of the National Academy of Sciences of the United States of America. 1993;90:7558-7562.
17. Remmert M, Biegert A, Hauser A, Söding J. HHblits: lightning-fast iterative protein sequence searching by HMM-HMM alignment. Nature Methods. 2012;9:173-175.
18. Jones DT. Protein secondary structure prediction based on position-specific scoring matrices. Journal of Molecular Biology. 1999;292:195-202.
19. Klausen MS, Jespersen MC, Nielsen H, Jensen KK, Jurtz VI. NetSurfP-2.0: Improved prediction of protein structural features by integrated deep learning. Proteins: Structure, Function, and Bioinformatics. 2019;87:520-527.
20. Hanson J, Paliwal K, Litfin T, Yang Y, Zhou Y. Improving prediction of protein secondary structure, backbone angles, solvent accessibility and contact numbers by using predicted contact maps and an ensemble of recurrent and residual convolutional neural networks. Valencia A, ed. Bioinformatics. 2019;35(14):2403-2410. doi:10.1093/bioinformatics/bty1006
21. Ofer D, Brandes N, Linial M. The language of proteins: NLP, machine learning & protein sequences. Computational and Structural Biotechnology Journal. 2021;19:1750-1758. doi:10.1016/j.csbj.2021.03.022
22. Heinzinger M, Elnaggar A, Wang Y, et al. Modeling aspects of the language of life through transfer-learning protein sequences. BMC Bioinformatics. 2019;20(1):723. doi:10.1186/s12859-019-3220-8
23. Elnaggar A, Heinzinger M, Dallago C, et al. ProtTrans: Towards Cracking the Language of Life's Code Through Self-Supervised Deep Learning and High Performance Computing. IEEE Trans Pattern Anal Mach Intell. Published online 2021:1-1. doi:10.1109/TPAMI.2021.3095381
24. Rives A, Meier J, Sercu T, et al. Biological structure and function emerge from scaling unsupervised learning to 250 million protein sequences. Proc Natl Acad Sci USA. 2021;118(15):e2016239118. doi:10.1073/pnas.2016239118
25. Vig J, Madani A, Varshney LR, Xiong C, Socher R, Rajani N. BERTology Meets Biology: Interpreting Attention in Protein Language Models. In: International Conference on Learning Representations; 2021. https://openreview.net/forum?id=YWtLZvLmud7
26. Høie MH, Kiehl EN, Petersen B, et al. NetSurfP-3.0: accurate and fast prediction of protein structural features by protein language models and deep learning. Nucleic Acids Research. 2022;50(W1):W510-W515. doi:10.1093/nar/gkac439
27. Singh J, Paliwal K, Litfin T, Singh J, Zhou Y. Reaching alignment-profile-based accuracy in predicting protein secondary and tertiary structural properties without alignment. Sci Rep. 2022;12(1):7607. doi:10.1038/s41598-022-11684-w
28. Jumper J, Evans R, Pritzel A, et al. Highly accurate protein structure prediction with AlphaFold. Nature. 2021;596(7873):583-589. doi:10.1038/s41586-021-03819-2
29. Kryshtafovych A, Schwede T, Topf M, Fidelis K, Moult J. Critical assessment of methods of protein structure prediction (CASP)—Round XIV. Proteins: Structure, Function, and Bioinformatics. 2021;89(12):1607-1617. doi:10.1002/prot.26237
30. Stevens AO, He Y. Benchmarking the Accuracy of AlphaFold 2 in Loop Structure Prediction. Biomolecules. 2022;12(7):985. doi:10.3390/biom12070985
31. Stapor K, Kotowski K, Smolarczyk T, Roterman I. Lightweight ProteinUnet2 network for protein secondary structure prediction: a step towards proper evaluation. BMC Bioinformatics. 2022;23(1):100. doi:10.1186/s12859-022-04623-z
32. AlQuraishi M. ProteinNet: a standardized data set for machine learning of protein structure. BMC Bioinformatics. 2019;20(1):311. doi:10.1186/s12859-019-2932-0
33. Isensee F, Jaeger PF, Kohl SAA, Petersen J, Maier-Hein KH. nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nat Methods. 2021;18(2):203-211. doi:10.1038/s41592-020-01008-z
34. Kotowski K, Adamski S, Machura B, Zarudzki L, Nalepa J. Coupling nnU-Nets with Expert Knowledge for Accurate Brain Tumor Segmentation from MRI. In: Crimi A, Bakas S, eds. Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries. Lecture Notes in Computer Science. Springer International Publishing; 2022:197-209. doi:10.1007/978-3-031-09002-8_18
35. Isensee F, Ulrich C, Wald T, Maier-Hein KH. Extending nnU-Net is all you need. Published online August 23, 2022. doi:10.48550/arXiv.2208.10791
36. Kotowski K, Smolarczyk T, Roterman‐Konieczna I, Stapor K. ProteinUnet—An efficient alternative to SPIDER3-single for sequence-based prediction of protein secondary structures. Journal of Computational Chemistry. 2021;42(1):50-59. doi:10.1002/jcc.26432
37. Vaswani A, Shazeer N, Parmar N, et al. Attention is All you Need. In: Advances in Neural Information Processing Systems. Vol 30. Curran Associates, Inc.; 2017. Accessed September 14, 2022. https://papers.nips.cc/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html
38. Oktay O, Schlemper J, Folgoc LL, et al. Attention U-Net: Learning Where to Look for the Pancreas. In: Medical Imaging with Deep Learning; 2018. https://openreview.net/forum?id=Skft7cijM
39. Rao R, Meier J, Sercu T, Ovchinnikov S, Rives A. Transformer protein language models are unsupervised structure learners. Published online December 15, 2020:2020.12.15.422761. doi:10.1101/2020.12.15.422761
40. Chicco D, Jurman G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genomics. 2020;21(1):6. doi:10.1186/s12864-019-6413-7
41. Chicco D, Tötsch N, Jurman G. The Matthews correlation coefficient (MCC) is more reliable than balanced accuracy, bookmaker informedness, and markedness in two-class confusion matrix evaluation. BioData Mining. 2021;14(1):13. doi:10.1186/s13040-021-00244-z
42. Abhishek K, Hamarneh G. Matthews Correlation Coefficient Loss for Deep Convolutional Networks: Application to Skin Lesion Segmentation. Published online February 20, 2021. Accessed September 18, 2022. http://arxiv.org/abs/2010.13454
43. Kingma DP, Ba J. Adam: A Method for Stochastic Optimization. In: Bengio Y, LeCun Y, eds. 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings; 2015. http://arxiv.org/abs/1412.6980
44. Batuwita R, Palade V. Adjusted geometric-mean: a novel performance measure for imbalanced bioinformatics datasets learning. J Bioinform Comput Biol. 2012;10(04):1250003. doi:10.1142/S0219720012500035
45. Liu T, Wang Z. SOV_refine: A further refined definition of segment overlap score and its significance for protein structure similarity. Source Code Biol Med. 2018;13:1. doi:10.1186/s13029-018-0068-7
46. Stapor K, Ksieniewicz P, García S, Woźniak M. How to design the fair experimental classifier evaluation. Applied Soft Computing. 2021;104:107219. doi:10.1016/j.asoc.2021.107219
47. Cohen J. Statistical Power Analysis for the Behavioral Sciences. 2nd Edition. Routledge; 1988.
48. He K, Zhang X, Ren S, Sun J. Deep Residual Learning for Image Recognition. In: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); 2016:770-778. doi:10.1109/CVPR.2016.90
49. Hochreiter S, Schmidhuber J. Long short-term memory. Neural Computation. 1997;9(8):1735-1780.
50. Wang X, Girshick R, Gupta A, He K. Non-local Neural Networks. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition; 2018:7794-7803. doi:10.1109/CVPR.2018.00813
51. Drozdzal M, Vorontsov E, Chartrand G, Kadoury S, Pal C. The Importance of Skip Connections in Biomedical Image Segmentation. In: Carneiro G, Mateus D, Peter L, et al., eds. Deep Learning and Data Labeling for Medical Applications. Lecture Notes in Computer Science. Springer International Publishing; 2016:179-187. doi:10.1007/978-3-319-46976-8_19
52. Weytjens H, De Weerdt J. Process Outcome Prediction: CNN vs. LSTM (with Attention). In: Del Río Ortega A, Leopold H, Santoro FM, eds. Business Process Management Workshops. Lecture Notes in Business Information Processing. Springer International Publishing; 2020:321-333. doi:10.1007/978-3-030-66498-5_24
53. Li B, Zhou E, Huang B, et al. Large scale recurrent neural network on GPU. In: 2014 International Joint Conference on Neural Networks (IJCNN); 2014:4062-4069. doi:10.1109/IJCNN.2014.6889433
54. Jacoboni I, Martelli PL, Fariselli P, Compiani M, Casadio R. Predictions of protein segments with the same aminoacid sequence and different secondary structure: A benchmark for predictive methods. Proteins: Structure, Function, and Bioinformatics. 2000;41(4):535-544. doi:10.1002/1097-0134(20001201)41:4<535::AID-PROT100>3.0.CO;2-C
55. Saravanan KM, Selvaraj S. Performance of secondary structure prediction methods on proteins containing structurally ambivalent sequence fragments. Peptide Science. 2013;100(2):148-153. doi:10.1002/bip.22178
56. Ghozlane A, Joseph AP, Bornot A, de Brevern AG. Analysis of protein chameleon sequence characteristics. Bioinformation. 2009;3(9):367-369.
57. Kotowski K, Stapor K, Ochab J. Deep Learning Methods in Electroencephalography. In: Tsihrintzis GA, Jain LC, eds. Machine Learning Paradigms: Advances in Deep Learning-Based Technological Applications. Learning and Analytics in Intelligent Systems. Springer International Publishing; 2020:191-212. doi:10.1007/978-3-030-49724-8_8