There were no statistically significant differences between ProteinUnetLM and SPOT-1D-LM in terms of SOV8, but in most cases (excluding TEST2020-HQ) ProteinUnetLM had a better mean and a smaller standard deviation (SD). The only advantage of ProtT5Sec over ProteinUnetLM was its correct prediction of the rarest structure “I”, which greatly improved the macro-AGM at the residue level for TEST2018 and TEST2020. ProteinUnetLM was better than ProtT5Sec in all other aspects. ProteinUnetLM was not statistically significantly worse than its competitors in any metric or on any dataset. The competitive results of ProteinUnetLM on Neff1-2020 (sequences without homologs) and CASP12-FM (free modeling targets) demonstrate the network’s ability to generalize well beyond the protein folds included in the training/validation sets.

Comparison on CASP14

In the context of the recent success of AlphaFold2 in the CASP14 contest28, it is necessary to compare our network with SS8 predictions derived (using DSSP) from the AlphaFold2 tertiary structures submitted to that contest, in order to justify the relevance of our work. This comparison is far from fair, as AlphaFold2 is a much bigger model trained on a much bigger dataset. Despite this, ProteinUnetLM achieved a better macro-AGM for 10 out of 30 sequences from the CASP14 dataset (Supplementary Table S4), with a better residue-level AGM for the rare class G (Supplementary Table S5), and was not statistically significantly different from AlphaFold2 in that metric (Table 3). This supports the claim that ProteinUnetLM provides state-of-the-art results in terms of the AGM metric. AlphaFold2 dominated the other metrics and structures. It has a much better SOV8, which confirms the ability of this metric to evaluate the quality of tertiary structure prediction at the secondary structure level45, and a much higher Q8.
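To make the residue-level Q8 comparison concrete, the following is a minimal sketch of how Q8 can be computed once a DSSP-derived 8-state string and a predicted string are available. It is not the evaluation code used in this work: the function names `normalize_dssp` and `q8` are hypothetical, the example strings are made up, and the mapping of blank or "-" DSSP characters to the coil label "C" is an assumption about the input encoding.

```python
# Illustrative sketch only; names and example strings are assumptions,
# not taken from the evaluation pipeline described in the text.

# The eight DSSP secondary-structure states, with coil written as "C".
SS8_STATES = "HGIEBTSC"


def normalize_dssp(ss: str) -> str:
    """Map a raw DSSP string to the 8-state alphabet.

    DSSP writes coil as a blank (or "-" in some exports); both are mapped
    to "C" here. Any other unknown character raises an error.
    """
    out = []
    for ch in ss:
        if ch in " -":
            out.append("C")
        elif ch in SS8_STATES:
            out.append(ch)
        else:
            raise ValueError(f"unexpected DSSP symbol: {ch!r}")
    return "".join(out)


def q8(true_ss: str, pred_ss: str) -> float:
    """Fraction of residues whose predicted 8-state label matches the reference."""
    if len(true_ss) != len(pred_ss):
        raise ValueError("reference and prediction must have the same length")
    if not true_ss:
        raise ValueError("empty sequence")
    matches = sum(t == p for t, p in zip(true_ss, pred_ss))
    return matches / len(true_ss)


# Made-up 10-residue example: one mismatch at the last position.
reference = normalize_dssp("HHHH--EEEE")
predicted = "HHHHCCEEEC"
print(q8(reference, predicted))  # 0.9
```

Because Q8 weights every residue equally, it is dominated by the frequent classes, which is consistent with a model leading in Q8 while another remains competitive on a class-balanced metric such as macro-AGM.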
Setting aside AlphaFold2, ProteinUnetLM dominated all other networks in terms of macro-AGM at the sequence level with relatively large effect sizes (d > 0.3) and achieved the highest SOV8 (statistically significantly better than ProteinUnet2 and NetSurfP-3.0). As with TEST2018 and TEST2020, ProtT5Sec was able to predict the rarest structure “I” (Supplementary Table S5), so it surpassed ProteinUnetLM in macro-AGM at the residue level; this turns out to be one of the most prominent features of the ProtT5Sec network. In terms of Q8, ProteinUnetLM was second only to SPOT-1D-LM.
Table 3. Comparison of macro-AGM at the sequence and residue levels, SOV8 at the sequence level, and Q8 at the residue level on CASP14 for ProteinUnetLM vs. all other networks. The best results for each metric are boldfaced and the second best are underlined. Green/red shading of sequence-level scores denotes that ProteinUnetLM has a statistically significantly better/worse mean; standard deviations (SD), p-values, and Cohen’s effect size (d) are given below each score.
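The effect sizes reported with the sequence-level scores can be illustrated with a short sketch. This section does not state which variant of Cohen’s d was used, so the code below assumes the classic pooled-standard-deviation form; the function name `cohens_d` and the per-sequence scores are hypothetical and serve only as an example.

```python
# Illustrative sketch only: assumes the pooled-SD form of Cohen's d,
# which may differ from the variant used to produce the reported values.
from math import sqrt
from statistics import mean, variance


def cohens_d(a: list[float], b: list[float]) -> float:
    """Cohen's d between two groups of per-sequence scores (pooled SD)."""
    na, nb = len(a), len(b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two scores")
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    if pooled_var == 0:
        raise ValueError("pooled variance is zero; effect size is undefined")
    return (mean(a) - mean(b)) / sqrt(pooled_var)


# Made-up sequence-level scores for two hypothetical networks.
scores_a = [0.6, 0.7, 0.8]
scores_b = [0.5, 0.6, 0.7]
print(round(cohens_d(scores_a, scores_b), 3))  # 1.0
```

A positive value means the first group has the higher mean; under this convention, the d > 0.3 threshold mentioned in the text would correspond to a mean difference of more than 0.3 pooled standard deviations.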