As illustrated in Figure \ref{423464}, the energies from each method are compared against the DLPNO-CCSD(T)/cc-pVTZ energies for each molecule (e.g., astex_1hwi in Figure \ref{423464}). Since each molecule has several conformers, three metrics are compiled: the mean absolute relative energy (MARE) compared to the DLPNO-CCSD(T) atomization energies, the Pearson $R^2$ correlation, and the Spearman rank correlation ρ. The MARE metric gives an absolute measure of the energetic errors, but since different methods use different energy scales (e.g., heats of formation for PM7 and force fields), the statistical correlations use linear regression ($R^2$) and relative ordering (Spearman ρ) to remove sources of systematic energy differences. For each metric and each method, the median value across molecules was compiled, as illustrated in Figure \ref{423464}, to represent the overall quality of a given method.
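As a minimal sketch of how the three per-molecule metrics could be computed, the following Python snippet takes the conformer energies of one molecule from a given method and from the reference. It is an illustration under stated assumptions, not the analysis code used in this work: the function names are hypothetical, and relative energies are assumed to be measured from the conformer with the lowest reference energy.

```python
from statistics import mean

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def conformer_metrics(e_method, e_ref):
    """Return (MARE, R^2, Spearman rho) for the conformers of one molecule.

    Assumption: relative energies are taken with respect to the conformer
    that has the lowest reference energy.
    """
    ref_min = min(range(len(e_ref)), key=lambda i: e_ref[i])
    rel_m = [e - e_method[ref_min] for e in e_method]
    rel_r = [e - e_ref[ref_min] for e in e_ref]
    mare = mean(abs(a - b) for a, b in zip(rel_m, rel_r))
    # Correlations are unaffected by a constant offset or a positive scale
    # factor, which is why they tolerate different energy zero points.
    r2 = pearson_r(e_method, e_ref) ** 2
    rho = pearson_r(average_ranks(e_method), average_ranks(e_ref))
    return mare, r2, rho
```

With made-up energies such as `e_ref = [0.0, 1.0, 2.0, 3.0]` and `e_method = [10.0, 12.0, 14.0, 16.0]`, the method is offset and scaled relative to the reference, so MARE is nonzero while both correlations equal one. This is the kind of systematic difference the correlation metrics are meant to remove.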
Since the metrics are unlikely to follow normal distributions (e.g., Figure \ref{423464} shows highly non-Gaussian distributions), confidence intervals cannot be established from analytical formulas. Consequently, we used bootstrap sampling to establish 95% confidence intervals for the medians, as reported below. For ease of discussion, we give the full confidence ranges in all tables and figures, but indicate ± errors using the average of the upper and lower bounds. In general, the asymmetry between the upper and lower bounds is small.
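The bootstrap procedure can be sketched as follows. This is a simple percentile bootstrap written for illustration; the number of resamples, the seed, and the function name are assumptions, and the specific bootstrap variant used in this work may differ.

```python
import random
from statistics import median

def bootstrap_median_ci(values, n_resamples=2000, level=0.95, seed=0):
    """Percentile-bootstrap confidence interval for the median.

    Returns (sample median, lower bound, upper bound).
    n_resamples and seed are illustrative defaults, not values from the text.
    """
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the median of each resample.
    medians = sorted(
        median(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    alpha = (1.0 - level) / 2.0
    lo = medians[int(alpha * n_resamples)]
    hi = medians[min(n_resamples - 1, int((1.0 - alpha) * n_resamples))]
    return median(values), lo, hi
```

Given the per-molecule values of one metric for one method, `med, lo, hi = bootstrap_median_ci(values)` yields the median and its confidence range. One reading of the symmetric ± error described above is `(hi - lo) / 2`, which equals the average of the upper deviation `hi - med` and the lower deviation `med - lo`.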
By considering a large number of diverse organic molecules with many poses per molecule, we seek to sample a wide variety of conformer energy preferences (e.g., intramolecular hydrogen and halogen bonding, π-π stacking, and electrostatic interactions). While using optimized low-energy conformers may underestimate the accuracy of methods for high-energy structures,\cite{Sharapa_2018} we believe the current work is a challenging but useful comparison. In general, such high-energy geometries reflect steric repulsion more than the diverse types of interactions driving low-energy geometries.
Moreover, many computational predictions rely on Boltzmann-weighted averages of multiple thermally accessible conformers, including NMR prediction,\cite{Lodewyk_2011,Grimme_2004} reactions, and even understanding the effects of dipole moments on solvent viscosity.\cite{Vo_2019} Consequently, deriving accurate relative energies of molecular conformers is a crucial task, as discussed below.

Comparison of single points vs. DLPNO-CCSD(T)

For comparison, we considered a wide variety of currently available computational methods:
For the dispersion-corrected B3LYP and PBE functionals, we also considered both the commonly used double-zeta def2-SVP and triple-zeta def2-TZVP basis sets to understand the effects of basis set size. For B3LYP, PBE, and ωB97X, we also considered the accuracy with and without dispersion correction.