Introduction

For almost all molecules, multiple geometrically distinct conformers exist. Understanding and predicting thermodynamically accessible ensembles of molecular conformers is a key task underlying much of computational chemistry.\cite{Grimme_2004,Lodewyk_2011,Jackson_2014} In principle, the number of possible minima increases exponentially with the number of rotatable bonds. Consequently, most conformer sampling methods\cite{Hawkins_2017} use classical small-molecule force fields to evaluate energies because of their speed, despite potentially poor correlation with quantum mechanical methods.\cite{Kanal_2017}
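As a rough illustration of this scaling, assume (for this sketch only) that rotatable bond \(i\) contributes \(k_i\) distinct torsional minima and that the bonds are independent. The symbols \(N_\mathrm{conf}\), \(n\), and \(k_i\) are introduced here purely for illustration. The number of candidate conformers is then bounded by
\[
N_\mathrm{conf} \le \prod_{i=1}^{n} k_i ,
\]
which reduces to \(k^{n}\) when every bond has the same number of minima \(k\). For example, with \(k = 3\) (three staggered minima per bond) and \(n = 10\) rotatable bonds, this gives \(3^{10} = 59049\) candidate geometries, although steric clashes and symmetry typically remove many of them.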
Multiple efforts have evaluated the success of wavefunction and density functional first-principles methods in comparing the energetics of different conformers.\cite{Habgood_2020,Sharapa_2018,Kesharwani_2015,_ez__2018,Prasad_2019,Kang_2018,Yuan_2014} While experimental crystal structures and bioactive docked conformers are not always the lowest-energy conformers, recent efforts have demonstrated only small energy differences when quantum chemical methods are used instead of force fields.\cite{Rai_2019,Foloppe_2019}
Even for simple molecules such as 1,1'-biphenyl, large basis set coupled cluster methods are needed to accurately place the dihedral angle and barrier.\cite{Johansson_2008} Other works have documented the need for accurate treatment of non-covalent interactions to model conformers in π-conjugated oligomers.\cite{Jackson_2013}
One common assumption is a trade-off between thermochemical accuracy and computational time. That is, more computationally intensive methods produce more accurate geometries and thermochemical properties. For example, composite ab initio thermochemical recipes such as G3,\cite{Curtiss_1998} G4,\cite{Curtiss_2007} and W1\cite{Martin_1999,Parthiban_2001} to W4\cite{Karton_2006} seek to provide highly accurate thermochemical predictions through separate estimates of basis set extrapolation and electron correlation. Still, such methods are largely limited to small molecules due to their high computational cost.\cite{Ghahremanpour_2016} As mentioned above, efforts for conformer sampling have often focused on classical force fields or multi-level approaches using semiempirical methods.\cite{Hawkins_2017,Hawkins2010,Ju_rez_Jim_nez_2014}
In our previous paper,\cite{Kanal_2017} we considered both the single-point energies and geometry optimizations from a range of common computational chemistry methods, including classical force fields, semiempirical quantum chemistry, and dispersion-corrected density functional methods. In general, due to the large differences in the potential energy surfaces predicted by force fields and quantum methods, we found poor correlation both between single-point energies at the same geometry and between geometries optimized using different methods.
In this work, in order to expand our range of computational methods, we consider only the relative single-point energies from the same set of density-functional optimized geometries, comparing multiple current methods to a high-quality coupled cluster baseline. We consider the mean absolute relative errors in energies (MARE), as well as the correlation of relative energies, reflected in the \(R^2\) coefficient of determination, and the ranking of single-point energies, reflected in the Spearman ρ correlation. The correlation coefficient and the Spearman correlation are intended to identify methods that exhibit systematic errors that may not affect linear correlation or the ranking of energetic stabilities.
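The three metrics can be sketched in a few lines of Python with numpy and scipy. This is a minimal illustration under stated assumptions, not the analysis code used for this work: relative energies are taken with respect to each method's own lowest-energy conformer, MARE is computed as the mean absolute difference of those relative energies, and \(R^2\) is taken as the squared Pearson coefficient. The function names `relative_energies` and `compare_to_reference` are hypothetical.

```python
import numpy as np
from scipy import stats


def relative_energies(energies):
    """Shift energies so that the lowest-energy conformer is zero."""
    energies = np.asarray(energies, dtype=float)
    return energies - energies.min()


def compare_to_reference(method_energies, reference_energies):
    """Return (MARE, R^2, Spearman rho) for the conformers of one molecule.

    Assumption: both inputs list the same conformers in the same order
    and use the same energy unit (e.g. kcal/mol).
    """
    rel_method = relative_energies(method_energies)
    rel_ref = relative_energies(reference_energies)

    # Mean absolute error of the relative energies
    mare = float(np.mean(np.abs(rel_method - rel_ref)))

    # Squared Pearson correlation as the linear R^2
    pearson_r, _ = stats.pearsonr(rel_method, rel_ref)
    r_squared = float(pearson_r ** 2)

    # Rank correlation: does the method order the conformers correctly?
    rho, _ = stats.spearmanr(rel_method, rel_ref)

    return mare, r_squared, float(rho)
```

For a hypothetical method that doubles every relative energy, `compare_to_reference([0.0, 2.0, 4.0, 6.0], [0.0, 1.0, 2.0, 3.0])` gives a MARE of 1.5 while \(R^2\) and ρ both remain 1 (to floating-point precision). A systematic error of this kind is visible only in the MARE, which is why the three metrics are reported together.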
While we find that increased accuracy typically still requires exponential increases in computational time, several methods stand out as widely useful for ranking conformer energies. Continued improvements in standard computational methods and machine learning surrogates suggest that further method development will deliver both greater accuracy and greater efficiency.

Computational Methods

Calculations were performed using Open Babel version 3.0\cite{21982300} for all force field calculations (MMFF94\cite{Halgren:1996bm,Halgren:1996kn,Halgren:1996ew,Halgren:1996hj,Halgren:1996ux} and UFF\cite{Rappe_1992,Casewit_1992}), OpenMOPAC for PM7,\cite{Stewart_2012} xtb version 6.2\cite{grimme-labxtb} for GFN0,\cite{Pracht_2019} GFN1,\cite{Grimme2017} and GFN2\cite{Bannwarth_2018} calculations, and Orca 4.0.1\cite{Neese_2011} for all density functional and ab initio calculations, unless otherwise indicated. For density functional methods, the D3(BJ)\cite{Grimme_2011,Becke_2005,Johnson_2005,Johnson_2006} dispersion correction scheme was used as indicated, except for \(\omega\)B97X-D3,\cite{Chai_2008} which uses a similar approach. For ab initio methods, Orca 4.0.1 was used for MP2\cite{Kossmann_2010} and DLPNO-CCSD(T)\cite{Liakos_2015,Guo_2018} with “TightPNO” settings and the cc-pVTZ basis set.\cite{Dunning_1989,Kendall_1992} Energies were read from all output files using cclib\cite{O_boyle_2008} version 1.6.2 and pybel version 3.0.\cite{O_Boyle_2008}
Machine learning methods included “bag-of-features” representations and the ANI-1x,\cite{Smith_2018} ANI-1ccx,\cite{S_Smith_2019} and ANI-2x\cite{Devereux_2020} models. The bag-of-features representations chosen were Bag of Bonds (BOB),\cite{Hansen_2015} Bond Angle Torsion (BAT),\cite{Huang_2016} and Bond Angle Torsion Typed (BATTY). BOB groups atoms and pairwise interactions into sorted bags, while BAT is a many-body expansion that adds angles and torsions. Both of these representations were implemented using chemreps.\cite{chemreps} The BATTY representation builds on BAT by adding minimal atom typing to all bond, angle, and torsion bags while excluding nonbonding interaction and nuclear charge bags from the final representation, as discussed below. scikit-learn\cite{scikit-learn} was used for kernel ridge regression of the bag-of-features representations.
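To make the idea of sorted bags concrete, the following is a simplified sketch of a Bag-of-Bonds-style feature vector combined with scikit-learn kernel ridge regression. It does not use the chemreps API and is not the implementation used here. The functions `bag_of_bonds` and `fit_krr` are hypothetical, the nuclear charge bag is omitted for brevity, and the Laplacian kernel and the `alpha` and `gamma` values are assumptions chosen for illustration only.

```python
from collections import defaultdict
from itertools import combinations

import numpy as np
from sklearn.kernel_ridge import KernelRidge


def bag_of_bonds(numbers, coords, bag_sizes):
    """Simplified Bag-of-Bonds vector for one geometry.

    numbers   : atomic numbers, e.g. [8, 1, 1]
    coords    : Cartesian coordinates, shape (n_atoms, 3)
    bag_sizes : dict mapping a sorted element pair such as (1, 8) to a
                fixed bag length, so all molecules give equal-length vectors
                (each bag length must cover the largest molecule)
    """
    coords = np.asarray(coords, dtype=float)
    bags = defaultdict(list)
    for i, j in combinations(range(len(numbers)), 2):
        key = tuple(sorted((numbers[i], numbers[j])))
        r_ij = np.linalg.norm(coords[i] - coords[j])
        # Coulomb-like pair term Z_i * Z_j / r_ij
        bags[key].append(numbers[i] * numbers[j] / r_ij)

    vector = []
    for key in sorted(bag_sizes):
        # Sorting within each bag removes the dependence on atom ordering
        values = sorted(bags.get(key, []), reverse=True)
        # Zero-pad so that every bag has its fixed length
        values += [0.0] * (bag_sizes[key] - len(values))
        vector.extend(values)
    return np.array(vector)


def fit_krr(features, energies, alpha=1e-8, gamma=1e-3):
    """Fit kernel ridge regression with a Laplacian kernel (assumed choice)."""
    model = KernelRidge(kernel="laplacian", alpha=alpha, gamma=gamma)
    model.fit(features, energies)
    return model
```

Because each bag is sorted, reordering the atoms of a conformer leaves the feature vector unchanged. In practice, `alpha` and `gamma` would be selected by cross-validation rather than fixed as above.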
For this work, all timings are single-core CPU times using a 2.60 GHz Intel Skylake CPU (Intel Xeon Gold 6126) with 192 GB RAM per node, through the University of Pittsburgh Center for Research Computing.
Python scripts and Jupyter notebooks were used to compile all data into pandas\cite{mckinney-proc-scipy-2010} data frames, using numpy\cite{van_der_Walt_2011} and scipy\cite{Virtanen_2020} functions for analysis. 3Dmol.js was used for interactive molecular visualization of conformers.\cite{Rego_2014} Plotly was used for interactive plots.\cite{plotly}
All scripts and data, including molecular geometries, are provided through GitHub (https://github.com/ghutchis/conformer-benchmark) with the intent that additional computational methods can be added to these benchmark comparisons.

Test Set Selection