Correlation between mean absolute relative energies (MARE) and median R² correlation. Since the R² metric is largely insensitive to systematic errors, the high degree of correlation between the two metrics indicates that most methods exhibit relatively random (non-systematic) errors. Error bars indicate 95% confidence intervals from bootstrap sampling.
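
The caption does not state which bootstrap variant produced the error bars. As a minimal sketch, assuming a simple percentile bootstrap over per-system values, a 95% interval for a summary statistic such as the median could be computed as follows. The function name `bootstrap_ci`, the resample count, and the seed are hypothetical choices for illustration, not details taken from the source.

```python
import numpy as np


def bootstrap_ci(values, stat=np.median, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval (illustrative sketch).

    values : 1-D sequence of per-system scores (assumed input layout)
    stat   : statistic accepting an `axis` argument, e.g. np.median or np.mean
    """
    values = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)
    # Draw n_boot resamples with replacement, each the size of the original sample.
    idx = rng.integers(0, len(values), size=(n_boot, len(values)))
    # Evaluate the statistic on every resample.
    boot_stats = stat(values[idx], axis=1)
    # The alpha/2 and 1 - alpha/2 quantiles bound the (1 - alpha) interval.
    lo, hi = np.quantile(boot_stats, [alpha / 2, 1 - alpha / 2])
    return float(lo), float(hi)
```

Calling `bootstrap_ci(scores)` on a hypothetical array of per-system R² values returns the lower and upper ends of the error bar for the median; passing `stat=np.mean` would give the corresponding interval for a mean-based metric.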