References
1. Kabsch W, Sander C. Dictionary of protein secondary structure:
Pattern recognition of hydrogen‐bonded and geometrical features. Biopolymers. 1983;22:2577-2637.
2. Liu J, Rost B. Comparing function and structure between entire
proteomes. Protein Science. 2001;10(10):1970-1979.
doi:10.1110/ps.10101
3. McGuffin LJ, Bryson K, Jones DT. What are the baselines for protein
fold recognition? Bioinformatics. 2001;17(1):63-72.
doi:10.1093/bioinformatics/17.1.63
4. Meiler J, Baker D. Coupled prediction of protein secondary and
tertiary structure. Proc Natl Acad Sci U S A.
2003;100(21):12105-12110. doi:10.1073/pnas.1831973100
5. Hung LH, Samudrala R. PROTINFO: secondary and tertiary protein
structure prediction. Nucleic Acids Research.
2003;31(13):3296-3299. doi:10.1093/nar/gkg541
6. Myers JK, Oas TG. Preorganized secondary structure as an important
determinant of fast protein folding. Nat Struct Mol Biol.
2001;8(6):552-558. doi:10.1038/88626
7. Gardy JL, Spencer C, Wang K, et al. PSORT-B: improving protein
subcellular localization prediction for Gram-negative bacteria. Nucleic Acids Research. 2003;31(13):3613-3617.
doi:10.1093/nar/gkg602
8. Van Domselaar GH, Stothard P, Shrivastava S, et al. BASys: a web
server for automated bacterial genome annotation. Nucleic Acids
Research. 2005;33(suppl_2):W455-W459. doi:10.1093/nar/gki593
9. Mewes HW, Frishman D, Mayer KFX, et al. MIPS: analysis and annotation
of proteins from whole genomes in 2005. Nucleic Acids Research.
2006;34(suppl_1):D169-D172. doi:10.1093/nar/gkj148
10. Wishart DS, Case DA. Use of Chemical Shifts in Macromolecular
Structure Determination. In: James TL, Dötsch V, Schmitz U, eds. Methods in Enzymology. Vol 338. Nuclear Magnetic Resonance of
Biological Macromolecules Part A. Academic Press; 2002:3-34.
doi:10.1016/S0076-6879(02)38214-4
11. Smolarczyk T, Roterman-Konieczna I, Stapor K. Protein Secondary
Structure Prediction: A Review of Progress and Directions. Current
Bioinformatics. 2020;15:90-107.
12. Chou PY, Fasman GD. Prediction of protein conformation. Biochemistry. 1974;13:222-245.
13. Garnier J, Osguthorpe DJ, Robson B. Analysis of the accuracy and
implications of simple methods for predicting the secondary structure of
globular proteins. Journal of Molecular Biology. 1978;120:97-120.
14. Lim VI. Algorithms for prediction of α-helical and β-structural
regions in globular proteins. Journal of Molecular Biology.
1974;88:873-894.
15. Jiang Q, Jin X, Lee SJ, Yao S. Protein secondary structure
prediction: A survey of the state of the art. Journal of Molecular
Graphics and Modelling. 2017;76:379-402. doi:10.1016/j.jmgm.2017.07.015
16. Rost B, Sander C. Improved prediction of protein secondary structure
by use of sequence profiles and neural networks. Proceedings of
the National Academy of Sciences of the United States of America.
1993;90:7558-7562.
17. Remmert M, Biegert A, Hauser A, Söding J. HHblits: lightning-fast
iterative protein sequence searching by HMM-HMM alignment. Nature
Methods. 2012;9:173-175.
18. Jones DT. Protein secondary structure prediction based on
position-specific scoring matrices. Journal of Molecular Biology.
1999;292:195-202.
19. Klausen MS, Jespersen MC, Nielsen H, Jensen KK, Jurtz VI.
NetSurfP-2.0: Improved prediction of protein structural features by
integrated deep learning. Proteins: Structure, Function, and
Bioinformatics. 2019;87:520-527.
20. Hanson J, Paliwal K, Litfin T, Yang Y, Zhou Y. Improving prediction
of protein secondary structure, backbone angles, solvent accessibility
and contact numbers by using predicted contact maps and an ensemble of
recurrent and residual convolutional neural networks. Valencia A, ed. Bioinformatics. 2019;35(14):2403-2410.
doi:10.1093/bioinformatics/bty1006
21. Ofer D, Brandes N, Linial M. The language of proteins: NLP, machine
learning & protein sequences. Computational and Structural
Biotechnology Journal. 2021;19:1750-1758.
doi:10.1016/j.csbj.2021.03.022
22. Heinzinger M, Elnaggar A, Wang Y, et al. Modeling aspects of the
language of life through transfer-learning protein sequences. BMC
Bioinformatics. 2019;20(1):723. doi:10.1186/s12859-019-3220-8
23. Elnaggar A, Heinzinger M, Dallago C, et al. ProtTrans: Towards
Cracking the Language of Life's Code Through Self-Supervised Deep
Learning and High Performance Computing. IEEE Trans Pattern Anal
Mach Intell. Published online 2021:1-1. doi:10.1109/TPAMI.2021.3095381
24. Rives A, Meier J, Sercu T, et al. Biological structure and function
emerge from scaling unsupervised learning to 250 million protein
sequences. Proc Natl Acad Sci USA. 2021;118(15):e2016239118.
doi:10.1073/pnas.2016239118
25. Vig J, Madani A, Varshney LR, Xiong C, Socher R, Rajani N.
BERTology Meets Biology: Interpreting Attention in Protein Language
Models. In: International Conference on Learning Representations.
2021. https://openreview.net/forum?id=YWtLZvLmud7
26. Høie MH, Kiehl EN, Petersen B, et al. NetSurfP-3.0: accurate and
fast prediction of protein structural features by protein language
models and deep learning. Nucleic Acids Research.
2022;50(W1):W510-W515. doi:10.1093/nar/gkac439
27. Singh J, Paliwal K, Litfin T, Singh J, Zhou Y. Reaching
alignment-profile-based accuracy in predicting protein secondary and
tertiary structural properties without alignment. Sci Rep.
2022;12(1):7607. doi:10.1038/s41598-022-11684-w
28. Jumper J, Evans R, Pritzel A, et al. Highly accurate protein
structure prediction with AlphaFold. Nature.
2021;596(7873):583-589. doi:10.1038/s41586-021-03819-2
29. Kryshtafovych A, Schwede T, Topf M, Fidelis K, Moult J. Critical
assessment of methods of protein structure prediction (CASP)—Round
XIV. Proteins: Structure, Function, and Bioinformatics.
2021;89(12):1607-1617. doi:10.1002/prot.26237
30. Stevens AO, He Y. Benchmarking the Accuracy of AlphaFold 2 in Loop
Structure Prediction. Biomolecules. 2022;12(7):985.
doi:10.3390/biom12070985
31. Stapor K, Kotowski K, Smolarczyk T, Roterman I. Lightweight
ProteinUnet2 network for protein secondary structure prediction: a step
towards proper evaluation. BMC Bioinformatics. 2022;23(1):100.
doi:10.1186/s12859-022-04623-z
32. AlQuraishi M. ProteinNet: a standardized data set for machine
learning of protein structure. BMC Bioinformatics.
2019;20(1):311. doi:10.1186/s12859-019-2932-0
33. Isensee F, Jaeger PF, Kohl SAA, Petersen J, Maier-Hein KH. nnU-Net:
a self-configuring method for deep learning-based biomedical image
segmentation. Nat Methods. 2021;18(2):203-211.
doi:10.1038/s41592-020-01008-z
34. Kotowski K, Adamski S, Machura B, Zarudzki L, Nalepa J. Coupling
nnU-Nets with Expert Knowledge for Accurate Brain Tumor Segmentation
from MRI. In: Crimi A, Bakas S, eds. Brainlesion: Glioma, Multiple
Sclerosis, Stroke and Traumatic Brain Injuries. Lecture Notes in
Computer Science. Springer International Publishing; 2022:197-209.
doi:10.1007/978-3-031-09002-8_18
35. Isensee F, Ulrich C, Wald T, Maier-Hein KH. Extending nnU-Net is all
you need. Published online August 23, 2022.
doi:10.48550/arXiv.2208.10791
36. Kotowski K, Smolarczyk T, Roterman‐Konieczna I, Stapor K.
ProteinUnet—An efficient alternative to SPIDER3-single for
sequence-based prediction of protein secondary structures. Journal
of Computational Chemistry. 2021;42(1):50-59.
doi:10.1002/jcc.26432
37. Vaswani A, Shazeer N, Parmar N, et al. Attention is All you Need.
In: Advances in Neural Information Processing Systems. Vol 30.
Curran Associates, Inc.; 2017. Accessed September 14, 2022.
https://papers.nips.cc/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html
38. Oktay O, Schlemper J, Folgoc LL, et al. Attention U-Net: Learning
Where to Look for the Pancreas. In: Medical Imaging with Deep
Learning. 2018. https://openreview.net/forum?id=Skft7cijM
39. Rao R, Meier J, Sercu T, Ovchinnikov S, Rives A. Transformer protein
language models are unsupervised structure learners. Published online
December 15, 2020:2020.12.15.422761. doi:10.1101/2020.12.15.422761
40. Chicco D, Jurman G. The advantages of the Matthews correlation
coefficient (MCC) over F1 score and accuracy in binary classification
evaluation. BMC Genomics. 2020;21(1):6.
doi:10.1186/s12864-019-6413-7
41. Chicco D, Tötsch N, Jurman G. The Matthews correlation coefficient
(MCC) is more reliable than balanced accuracy, bookmaker informedness,
and markedness in two-class confusion matrix evaluation. BioData
Mining. 2021;14(1):13. doi:10.1186/s13040-021-00244-z
42. Abhishek K, Hamarneh G. Matthews Correlation Coefficient Loss for
Deep Convolutional Networks: Application to Skin Lesion Segmentation.
Published online February 20, 2021. Accessed September 18, 2022.
http://arxiv.org/abs/2010.13454
43. Kingma DP, Ba J. Adam: A Method for Stochastic Optimization. In:
Bengio Y, LeCun Y, eds. 3rd International Conference on Learning
Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015,
Conference Track Proceedings. 2015. http://arxiv.org/abs/1412.6980
44. Batuwita R, Palade V. Adjusted geometric-mean: a novel performance
measure for imbalanced bioinformatics datasets learning. J
Bioinform Comput Biol. 2012;10(04):1250003.
doi:10.1142/S0219720012500035
45. Liu T, Wang Z. SOV_refine: A further refined definition of segment
overlap score and its significance for protein structure similarity. Source Code Biol Med. 2018;13:1. doi:10.1186/s13029-018-0068-7
46. Stapor K, Ksieniewicz P, García S, Woźniak M. How to design the fair
experimental classifier evaluation. Applied Soft Computing.
2021;104:107219. doi:10.1016/j.asoc.2021.107219
47. Cohen J. Statistical Power Analysis for the Behavioral
Sciences. 2nd Edition. Routledge; 1988.
48. He K, Zhang X, Ren S, Sun J. Deep Residual Learning for Image
Recognition. In: 2016 IEEE Conference on Computer Vision and
Pattern Recognition (CVPR). 2016:770-778. doi:10.1109/CVPR.2016.90
49. Hochreiter S, Schmidhuber J. Long short-term memory. Neural
Computation. 1997;9(8):1735-1780.
50. Wang X, Girshick R, Gupta A, He K. Non-local Neural Networks. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern
Recognition. 2018:7794-7803. doi:10.1109/CVPR.2018.00813
51. Drozdzal M, Vorontsov E, Chartrand G, Kadoury S, Pal C. The
Importance of Skip Connections in Biomedical Image Segmentation. In:
Carneiro G, Mateus D, Peter L, et al., eds. Deep Learning and Data
Labeling for Medical Applications. Lecture Notes in Computer Science.
Springer International Publishing; 2016:179-187.
doi:10.1007/978-3-319-46976-8_19
52. Weytjens H, De Weerdt J. Process Outcome Prediction: CNN vs. LSTM
(with Attention). In: Del Río Ortega A, Leopold H, Santoro FM, eds. Business Process Management Workshops. Lecture Notes in Business
Information Processing. Springer International Publishing; 2020:321-333.
doi:10.1007/978-3-030-66498-5_24
53. Li B, Zhou E, Huang B, et al. Large scale recurrent neural network
on GPU. In: 2014 International Joint Conference on Neural Networks
(IJCNN). 2014:4062-4069. doi:10.1109/IJCNN.2014.6889433
54. Jacoboni I, Martelli PL, Fariselli P, Compiani M, Casadio R.
Predictions of protein segments with the same aminoacid sequence and
different secondary structure: A benchmark for predictive methods. Proteins: Structure, Function, and Bioinformatics.
2000;41(4):535-544.
doi:10.1002/1097-0134(20001201)41:4<535::AID-PROT100>3.0.CO;2-C
55. Saravanan KM, Selvaraj S. Performance of secondary structure
prediction methods on proteins containing structurally ambivalent
sequence fragments. Peptide Science. 2013;100(2):148-153.
doi:10.1002/bip.22178
56. Ghozlane A, Joseph AP, Bornot A, de Brevern AG. Analysis of protein
chameleon sequence characteristics. Bioinformation.
2009;3(9):367-369.
57. Kotowski K, Stapor K, Ochab J. Deep Learning Methods in
Electroencephalography. In: Tsihrintzis GA, Jain LC, eds. Machine
Learning Paradigms: Advances in Deep Learning-Based Technological
Applications. Learning and Analytics in Intelligent Systems. Springer
International Publishing; 2020:191-212. doi:10.1007/978-3-030-49724-8_8