Results
Comparison with MSA-based classifiers
First, we directly compared ProteinUnetLM with its predecessor,
ProteinUnet2, which is based on multiple sequence alignment (MSA) features
(https://codeocean.com/capsule/0425426), and with its competitor, the
SPOT-1D network [20] (https://sparks-lab.org/server/spot-1d/). ProteinUnetLM achieved
the best results in 7 out of 8 combinations of the test set (TEST2016
and TEST2018) and metric (macro-AGM at residue and sequence levels,
SOV8, and Q8 at the residue level) presented in Table 1. ProteinUnetLM had
statistically significantly better macro-AGM than ProteinUnet2 (p=0.0054
on TEST2016) and SPOT-1D (p=2e-5 on TEST2016 and p=0.0077 on TEST2018),
although the effect sizes were small (d < 0.2). There were no large or
significant differences between the networks for the SOV8 metric.
ProteinUnetLM achieved the highest Q8 among all networks.
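
As an illustration of how p-values and effect sizes of this kind can be obtained from per-sequence scores, the sketch below computes Cohen's d and a paired significance test in plain Python. It is only a sketch under assumptions: this section does not state which statistical test was used, so a sign-flip permutation test stands in for it, the pooled-SD form of Cohen's d is assumed, and the function names `cohens_d` and `paired_permutation_p` are hypothetical.

```python
import random
import statistics


def cohens_d(a, b):
    """Cohen's d for two equal-sized samples, using the pooled SD (assumed variant)."""
    pooled_sd = ((statistics.variance(a) + statistics.variance(b)) / 2) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / pooled_sd


def paired_permutation_p(a, b, n_resamples=10_000, seed=0):
    """Two-sided sign-flip permutation test on paired per-sequence differences.

    This is a stand-in test chosen for illustration, not necessarily the
    test used to produce the p-values reported in the text.
    """
    rng = random.Random(seed)
    diffs = [x - y for x, y in zip(a, b)]
    observed = abs(statistics.mean(diffs))
    hits = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the sign of each paired difference is arbitrary.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(statistics.mean(flipped)) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return (hits + 1) / (n_resamples + 1)
```

Here `a` and `b` would hold the sequence-level macro-AGM scores of two networks on the same test sequences, in the same order. By Cohen's convention, |d| below 0.2 is read as a small effect, which is how a difference can be statistically significant on a large test set and still be small in magnitude.
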
The separate AGM scores for each SS8 in Supplementary Table S2 show that
ProteinUnetLM is better than ProteinUnet2 for all structures on the
largest datasets, TEST2016 and TEST2018, especially on the rare
structures B, G, S, and I. Importantly, ProteinUnetLM achieved correct
predictions for the rarest structure, “I”, which was not possible using
MSA features in ProteinUnet2. This confirms that LMs provide better
features for protein SS prediction than MSA-based methods such as PSSM
or HHblits, especially considering that ProteinUnetLM is a single
model, not an ensemble of 10 models like ProteinUnet2.
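
To make the per-structure comparison concrete, the following sketch computes a one-vs-rest AGM for a single SS8 class and averages it into a macro score. It rests on assumptions: the AGM formula is not given in this section, so the commonly used adjusted geometric mean definition (geometric mean of sensitivity and specificity, adjusted by the proportion of negatives, and zero when sensitivity is zero) is assumed; the handling of classes absent from a sequence is also an assumption, and the names `agm_one_vs_rest` and `macro_agm` are hypothetical.

```python
import math

SS8_CLASSES = "HGIEBTSC"  # DSSP 8-state labels; the order is arbitrary here


def agm_one_vs_rest(true_labels, pred_labels, target):
    """Adjusted geometric mean for one class treated as positive (assumed formula)."""
    tp = fn = tn = fp = 0
    for t, p in zip(true_labels, pred_labels):
        if t == target:
            if p == target:
                tp += 1
            else:
                fn += 1
        elif p == target:
            fp += 1
        else:
            tn += 1
    positives, negatives = tp + fn, tn + fp
    if tp == 0:
        # Sensitivity is zero (or the class is absent): AGM is defined as 0.
        return 0.0
    sensitivity = tp / positives
    # Assumption: with no negative residues, specificity is taken as 1.
    specificity = tn / negatives if negatives else 1.0
    neg_fraction = negatives / (positives + negatives)
    gm = math.sqrt(sensitivity * specificity)
    return (gm + specificity * neg_fraction) / (1 + neg_fraction)


def macro_agm(true_labels, pred_labels, classes=SS8_CLASSES):
    """Unweighted mean of per-class AGM over classes present in the true labels."""
    present = [c for c in classes if c in true_labels]
    scores = [agm_one_vs_rest(true_labels, pred_labels, c) for c in present]
    return sum(scores) / len(scores)
```

Because a class that is never correctly predicted scores 0 and the macro average weights all classes equally, a rare structure such as “I” lowers the macro score substantially when a network misses it entirely, even though it barely affects Q8.
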
Table 1. Comparison of macro-AGM at the sequence and residue levels, SOV8 at the sequence level, and Q8 at the residue level on two test sets for ProteinUnetLM vs ProteinUnet2 and SPOT-1D. The best
results for each dataset are boldfaced. Green shading of sequence-level
scores denotes that ProteinUnetLM has a statistically significantly
better mean; standard deviations (SD), p-values, and Cohen’s
effect size (d) are given below the score.