Search Results for "benchmark"

More DFT benchmarking

Selecting the appropriate density functional for one’s molecular system at hand is often a very confounding problem, especially for non-expert or first-time users of computational chemistry. The DFT zoo is vast and confusing, and perhaps what makes the situation worse is that there is no lack of benchmarking studies. For example, I have made more than 30 posts on benchmark studies, and I made no attempt to be comprehensive over the past dozen years!

One such benchmark study that I missed was presented by Mardirossian and Head-Gordon in 2017.1 They evaluated 200 density functionals using the MGCDB84 database, a combination of data from a number of different groups. They make a series of recommendations for local GGA, local meta-GGA, hybrid GGA, and hybrid meta-GGA functionals. And when pressed to choose just one functional overall, they opt for ωB97M-V, a range-separated hybrid meta-GGA with VV10 nonlocal correlation.

Goerigk and Mehta2 recently offered a review of the density functional zoo. Leaning heavily on benchmark studies using the GMTKN55 database,3 they report a number of observations. Of no surprise to readers of this blog, their main conclusion is that accounting for London dispersion is essential, usually through some type of correction like those proposed by Grimme.

These authors also note the general disparity between the most accurate, best-performing functionals per the benchmark studies and the results of the DFT poll conducted for many years by Swart, Bickelhaupt and Duran. It is somewhat remarkable that PBE or PBE0 have topped the poll for many years, despite the fact that many newer functionals perform better. As always, when choosing a functional, caveat emptor.


1.  Mardirossian, N.; Head-Gordon, M., “Thirty years of density functional theory in computational chemistry: an overview and extensive assessment of 200 density functionals.” Mol. Phys. 2017, 115, 2315-2372, DOI: 10.1080/00268976.2017.1333644.

2. Goerigk, L.; Mehta, N., “A Trip to the Density Functional Theory Zoo: Warnings and Recommendations for the User.” Aust. J. Chem. 2019, ASAP, DOI: 10.1071/CH19023.

3. Goerigk, L.; Hansen, A.; Bauer, C.; Ehrlich, S.; Najibi, A.; Grimme, S., “A look at the density functional theory zoo with the advanced GMTKN55 database for general main group thermochemistry, kinetics and noncovalent interactions.” Phys. Chem. Chem. Phys. 2017, 19, 32184-32215, DOI: 10.1039/C7CP04913G.

DFT Steven Bachrach 18 Mar 2019 No Comments

Benchmarking Platonic solids and related hydrocarbons

Karton, Schreiner, and Martin have benchmarked the heats of formation of some Platonic solids and related hydrocarbons.1 The molecules examined are tetrahedrane 1, cubane 2, dodecahedrane 3, triprismane 4, pentaprismane 5, and octahedrane 6.

The optimized structures (B3LYP-D3BJ/def2-TZVPP) of these compounds are shown in Figure 1.







Figure 1. B3LYP-D3BJ/def2-TZVPP optimized geometries of 1-6.

The heats of formation of these hydrocarbons, estimated using the W1-F12 and W2-F12 composite methods, are listed in Table 1. Experimental values are available only for 2 and 3; the computed values are off by about 2 kcal mol-1, which the authors argue is just outside the error bars of the computations. They suggest that the experiments might need to be revisited.

Table 1. Heats of formation (kcal mol-1) of 1-6.


Compound    ΔHf (comp)    ΔHf (expt)
1
2                         142.7 ± 1.2
3                         22.4 ± 1
4
5
6

They conclude with a comparison of strain energies computed using isogyric, isodesmic, and homodesmotic reactions with a variety of computational methods. Somewhat disappointingly, most DFT methods have appreciable errors compared with the W1-F12 results, and the errors vary depending on the chemical reaction employed. However, the double hybrid method DSD-PBEP86-D3BJ consistently reproduces the W1-F12 results.
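To make the reaction-based approach concrete, here is a minimal sketch that checks the atom balance of one commonly quoted homodesmotic reaction for cubane, C8H8 + 12 C2H6 → 8 HC(CH3)3, and evaluates a strain energy from heats of formation supplied by the caller. The choice of reaction and all of the names below are my own illustration and are not necessarily what Karton, Schreiner, and Martin used; no numerical heats of formation are asserted here.

```python
from collections import Counter

# Molecular formulas as atom counts (illustrative species only).
FORMULAS = {
    "cubane": {"C": 8, "H": 8},
    "ethane": {"C": 2, "H": 6},
    "isobutane": {"C": 4, "H": 10},
}

# Assumed homodesmotic reaction: cubane + 12 ethane -> 8 isobutane.
REACTANTS = {"cubane": 1, "ethane": 12}
PRODUCTS = {"isobutane": 8}


def atom_totals(side):
    """Sum the atoms on one side of a reaction given {species: coefficient}."""
    totals = Counter()
    for species, coeff in side.items():
        for element, n in FORMULAS[species].items():
            totals[element] += coeff * n
    return totals


def is_balanced(reactants, products):
    """True when both sides contain the same number of each element."""
    return atom_totals(reactants) == atom_totals(products)


def reaction_enthalpy(reactants, products, heats_of_formation):
    """Hess's law: sum over products minus sum over reactants."""
    h_prod = sum(c * heats_of_formation[s] for s, c in products.items())
    h_reac = sum(c * heats_of_formation[s] for s, c in reactants.items())
    return h_prod - h_reac


def strain_energy(reactants, products, heats_of_formation):
    """Strain energy taken as the negative of the strain-releasing reaction enthalpy."""
    return -reaction_enthalpy(reactants, products, heats_of_formation)
```

Because the strain energy depends on which reference reaction is written down, the same set of heats of formation can give different numbers for isogyric, isodesmic, and homodesmotic schemes, which is why the DFT errors change with the reaction employed.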


(1)  Karton, A.; Schreiner, P. R.; Martin, J. M. L. "Heats of formation of platonic hydrocarbon cages by means of high-level thermochemical procedures," J. Comput. Chem. 2016, 37, 49-58, DOI: 10.1002/jcc.23963.


1: InChI=1S/C4H4/c1-2-3(1)4(1)2/h1-4H

2: InChI=1S/C8H8/c1-2-5-3(1)7-4(1)6(2)8(5)7/h1-8H

3: InChI=1S/C20H20/c1-2-5-7-3(1)9-10-4(1)8-6(2)12-11(5)17-13(7)15(9)19-16(10)14(8)18(12)20(17)19/h1-20H

4: InChI=1S/C6H6/c1-2-3(1)6-4(1)5(2)6/h1-6H

5: InChI=1S/C10H10/c1-2-5-3(1)7-8-4(1)6(2)10(8)9(5)7/h1-10H

6: InChI=1S/C12H12/c1-2-4-6-5(11-7(1)10(4)11)3(1)9-8(2)12(6)9/h1-12H

QM Method & Schreiner Steven Bachrach 10 May 2016 No Comments

Keto-enol Benchmark Study

The keto-enol tautomerization is a fundamental concept in organic chemistry, taught in the introductory college course. As such, it provides an excellent test reaction to benchmark the performance of computational methods. Acevedo and colleagues have reported just such a benchmark study.1

First, they compare a wide variety of methods, ranging from semi-empirical to DFT to composite procedures, against experimental gas-phase free energies of tautomerization. They use seven such examples, two of which are shown in Scheme 1. The best results in each computational category come from AM1, with a mean absolute error (MAE) of 1.73 kcal mol-1, M06/6-31+G(d,p), with a MAE of 0.71 kcal mol-1, and G4, with a MAE of 0.95 kcal mol-1. All of the modern functionals do a fairly good job, with MAEs less than 1.3 kcal mol-1.

Scheme 1
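For readers unfamiliar with the metric, the mean absolute error quoted above is simply the average of |computed − experimental| over the test reactions. Here is a minimal sketch, with a function name of my choosing and made-up placeholder numbers rather than data from the paper:

```python
def mean_absolute_error(computed, reference):
    """Average of |computed - reference| over paired values (same units)."""
    if len(computed) != len(reference) or not computed:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(c - r) for c, r in zip(computed, reference)) / len(computed)


# Placeholder free energies of tautomerization in kcal/mol (NOT from the paper).
example_computed = [9.5, 12.0, 4.0]
example_experiment = [10.0, 11.0, 4.5]
example_mae = mean_absolute_error(example_computed, example_experiment)
```

Since the MAE discards the sign of each error, a method with a small MAE may still be systematically biased in one direction.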

As might be expected, the errors were appreciably larger for predicting the free energy of tautomerization in solution, with a good spread of errors depending on the method for handling solvent (PCM, CPCM, SMD) and the choice of cavity radius. The best results were with the G4/PCM/UA0 procedure, though M06/6-31+G(d,p)/PCM with either UA0 or UFF performed quite well, at considerably less computational expense.


(1) McCann, B. W.; McFarland, S.; Acevedo, O. "Benchmarking Continuum Solvent Models for Keto–Enol Tautomerizations," J. Phys. Chem. A 2015, 119, 8724-8733, DOI: 10.1021/acs.jpca.5b04116.

Keto-enol tautomerization & QM Method Steven Bachrach 12 Oct 2015 No Comments

Benchmarking π-conjugation

With the proliferation of density functionals, selecting the functional to use in your particular application requires some care. That is why there have been quite a number of benchmark studies (see these posts for some examples). Yu and Karton have now added to our benchmark catalog with a study of π-conjugation.1

They looked at a set of 60 reactions, each of which involves a reactant with π-conjugation and a product that lacks conjugation. A few examples, involving both linear and cyclic systems, are shown in Scheme 1.

Scheme 1.

The reaction energies were evaluated at W2-F12, which should have an error of a fraction of a kcal mol-1. Three of the reactions can be compared with experimental values, and the differences between the experimental and computed values are well within the error bars of the experiment. It is too bad that the authors did not also examine 1,3-cyclohexadiene → 1,4-cyclohexadiene, a reaction that is of broader interest than many of the ones included in the test set and can also be compared with experiment.

These 60 reactions were then evaluated with a slew of functionals from every rung of Jacob’s ladder. The highlights of this benchmark study are that most GGA, meta-GGA, and hybrid functionals (like B3LYP) have errors that exceed chemical accuracy (about 1 kcal mol-1). However, the range-separated functionals, including ωB97X-D, give very good energies. The best results are provided by the double hybrid functionals. Lastly, the D3 dispersion correction generally improves the energies by 10-20%. On the wavefunction side, SCS-MP2 gives excellent results, and may be one of the best choices when considering computational resources.


(1) Yu, L.-J.; Karton, A. "Assessment of theoretical procedures for a diverse set of isomerization reactions involving double-bond migration in conjugated dienes," Chem. Phys. 2014, 441, 166-177, DOI: 10.1016/j.chemphys.2014.07.015.

DFT Steven Bachrach 14 Jan 2015 2 Comments

Benchmarking conformations: melatonin

Conformational analysis is one of the tasks at which computational chemistry is typically quite adept, and it is frequently employed for this purpose. Thus, benchmarking methods for their ability to predict accurate conformational energies is quite important. Martin has done this for alkanes1 (see this post), and now he has looked at a molecule that contains weak intramolecular hydrogen bonds. He examined 52 conformations of melatonin 1.2 The structures of the two lowest energy conformations are shown in Figure 1.




Figure 1. Structures of the two lowest energy conformers of 1 at SCS-MP2/cc-pVTZ.

The benchmark (i.e. accurate) relative energies of these conformers were obtained at MP2-F12/cc-pVTZ-F12 with a correction for the role of triples: E(CCSD(T)/cc-pVTZ) − E(MP2/cc-pVTZ). The energies of the conformers were computed with a broad variety of basis sets and quantum methodologies. The root mean square deviation (RMSD) from the benchmark energies is used as a measure of the utility of these alternate methodologies. Of particular note is that HF predicts the wrong ordering of the two lowest energy conformers, as do some DFT methods that use small basis sets and do not incorporate dispersion.
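The composite scheme and the RMSD metric can be sketched as follows. This is only an illustration under my own assumptions: the function names are hypothetical, each list holds total energies for the same conformers in the same order, and relative energies are taken with respect to the first conformer in the list.

```python
import math


def composite_energies(e_mp2_f12, e_ccsdt_small, e_mp2_small):
    """MP2-F12 energy plus the [CCSD(T) - MP2] correction from the smaller basis."""
    return [
        big + (cc - mp2)
        for big, cc, mp2 in zip(e_mp2_f12, e_ccsdt_small, e_mp2_small)
    ]


def relative_energies(energies):
    """Energies relative to the first conformer in the list."""
    return [e - energies[0] for e in energies]


def rmsd(values, benchmark):
    """Root mean square deviation between two equal-length lists."""
    if len(values) != len(benchmark) or not values:
        raise ValueError("need two non-empty lists of equal length")
    return math.sqrt(
        sum((v - b) ** 2 for v, b in zip(values, benchmark)) / len(values)
    )
```

A typical use would be to pass the relative energies from a cheaper method and the composite benchmark relative energies to `rmsd`, giving one number per method for ranking.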

In fact, other than the M06 family or double hybrid functionals, all of the functionals examined here (PBE, BLYP, PBE0, B3LYP, TPSS0 and TPSS) have RMSD values greater than 1 kcal mol-1. However, inclusion of a dispersion correction, either Grimme’s D2 or D3 variety or the Vydrov-Van Voorhis (VV10) non-local correction (see this post for a review of dispersion corrections), reduces the error substantially. Among the best performing functionals are B2GP-PLYP-D3, TPSS0-D3, DSD-BLYP and M06-2X. They also find the MP2.5 method to be a practical ab initio alternative. One decidedly unfortunate result is that large basis sets are needed; DZ basis sets are simply unacceptable, and truly accurate performance requires a QZ basis set.


(1) Gruzman, D.; Karton, A.; Martin, J. M. L. "Performance of Ab Initio and Density Functional Methods for Conformational Equilibria of CnH2n+2 Alkane Isomers (n = 4-8)," J. Phys. Chem. A 2009, 113, 11974–11983, DOI: 10.1021/jp903640h.

(2) Fogueri, U. R.; Kozuch, S.; Karton, A.; Martin, J. M. L. "The Melatonin Conformer Space: Benchmark and Assessment of Wave Function and DFT Methods for a Paradigmatic Biological and Pharmacological Molecule," J. Phys. Chem. A 2013, 117, 2269-2277, DOI: 10.1021/jp312644t.


1: InChI=1S/C13H16N2O2/c1-9(16)14-6-5-10-8-15-13-4-3-11(17-2)7-12(10)13/h3-4,7-8,15H,5-6H2,1-2H3,(H,14,16)

DFT & MP Steven Bachrach 11 Apr 2013 2 Comments

Benchmarked Dispersion corrected DFT and SM12

This is a short post mainly to bring to the reader’s attention a couple of recent JCTC papers.

The first is a benchmark study by Hujo and Grimme of the geometries produced by DFT computations that are corrected for dispersion.1 They use the S22 and S66 test sets that span a range of compounds expressing weak interactions. Of particular note is that the B3LYP-D3 method provided the best geometries, suggesting that this much (and justly) maligned functional can be significantly improved with just the simple D3 fix.

The second paper entails the description of Truhlar and Cramer’s latest iteration on their solvation model, namely SM12.2 The main change here is the use of Hirshfeld-based charges, which comprise their Charge Model 5 (CM5). The training set used to obtain the needed parameters is much larger than with previous versions and allows for treating a very broad set of solvents. Performance of the model is excellent.


(1) Hujo, W.; Grimme, S. "Performance of Non-Local and Atom-Pairwise Dispersion Corrections to DFT for Structural Parameters of Molecules with Noncovalent Interactions," J. Chem. Theory Comput. 2013, 9, 308-315, DOI: 10.1021/ct300813c.

(2) Marenich, A. V.; Cramer, C. J.; Truhlar, D. G. "Generalized Born Solvation Model SM12," J. Chem. Theory Comput. 2013, 9, 609-620, DOI: 10.1021/ct300900e.

Cramer & DFT & Grimme & Solvation & Truhlar Steven Bachrach 14 Jan 2013 No Comments

Monosaccharides benchmark

A comprehensive evaluation of how different computational methods perform in predicting the energies of monosaccharides comes to some very interesting conclusions. Sameera and Pantazis1 have examined the eight different aldohexoses (allose, altrose, glucose, mannose, gulose, idose, galactose and talose), specifically looking at different rotamers of the hydroxymethyl group, α- vs. β-anomers, pyranose vs. furanose isomers, ring conformations (1C4 vs. skew boat forms), and ring vs. open chain isomers. In total, 58 different structures were examined. The benchmark computations are CCSD(T)/CBS single point energies using the SCS-MP2/def2-TZVPP optimized geometries. The RMS deviations from these benchmark energies for some of the many different methods examined are listed in Table 1.

Table 1. Average RMS errors (kJ mol-1) of the 58 different monosaccharide structures for different computational methods.

Perhaps the most interesting take-home message is that CEPA, MP2, the double hybrid methods and M06-2X all do a very good job at evaluating the energies of the carbohydrates. Given the significant computational advantage of M06-2X over these other methods, this seems to be the functional of choice! The poorer performance of the DFT methods relative to the ab initio methods is primarily in the relative energies of the open-chain isomers, where errors can be on the order of 10-20 kJ mol-1 with most of the functionals; even the best overall methods (M06-2X and the double hybrids) have errors of 7 kJ mol-1 in the relative energies of the open-chain isomers. This might be an area for further functional development, to seek a better treatment of the open-chain aldehydes vs. the ring hemiacetals.


(1) Sameera, W. M. C.; Pantazis, D. A. "A Hierarchy of Methods for the Energetically Accurate Modeling of Isomerism in Monosaccharides," J. Chem. Theory Comput. 2012, 8, 2630-2645, DOI:10.1021/ct3002305

DFT & sugars Steven Bachrach 28 Nov 2012 No Comments

Benchmarking DFT for the aldol and Mannich Reactions

Houk has performed a very nice examination of the performance of some density functionals.1 He takes a quite different approach than what was proposed by Grimme – the “mindless” benchmarking2 using random molecules (see this post). Rather, Houk examined a series of simple aldol, Mannich and α-aminoxylation reactions, comparing their reaction energies predicted with DFT against those predicted with CBS-QB3. The idea here is to benchmark DFT performance for simple reactions of specific interest to organic chemists. These reactions are of notable current interest due to their involvement in organocatalytic enantioselective chemistry (see my posts on the aldol, Mannich, and Hajos-Parrish-Eder-Sauer-Wiechert reaction). Examples of the reactions studied (along with their enthalpies at CBS-QB3) are Reactions 1-3.

Reaction 1

Reaction 2

Reaction 3

For the four simple aldol reactions and four simple Mannich reactions, PBE1PBE, mPW1PW91 and M06-2X all provided reaction enthalpies with errors of about 2 kcal mol-1. The much maligned B3LYP functional, along with B3PW91 and B1B95, gave energies with significantly larger errors. For the three α-aminoxylation reactions, the errors were smaller with B3PW91 and B1B95 than with PBE1PBE or M06-2X. Once again, it appears that one is faced with finding the right functional for the reaction under consideration!

Of particular interest is the decomposition of these reactions into related isogyric, isodesmic and homodesmotic reactions. So, for example, Reaction 1 can be decomposed into Reactions 4-7 as shown in Scheme 1. (The careful reader might note that these decomposition reactions are isodesmic and homodesmotic and hyperhomodesmotic reactions.) The errors for Reactions 4-7 are typically greater than 4 kcal mol-1 using B3LYP or B3PW91, and even with M06-2X the errors are about 2 kcal mol-1.

Scheme 1.

Houk also points out that Reactions 4, 8 and 9 (Scheme 2) focus on having similar bond changes as in Reactions 1-3. And it’s here that the results are most disappointing. The errors produced by all of the functionals for Reactions 4, 8 and 9 are typically greater than 2 kcal mol-1, and even M06-2X can be in error by as much as 5 kcal mol-1. It appears that the reasonable performance of the density functionals for the “real world” aldol and Mannich reactions relies on fortuitous cancellation of errors in the underlying reactions. Houk calls for the development of new functionals designed to deal with fundamental simple bond changing reactions, like the ones in Scheme 2.

Scheme 2
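The cancellation argument can be written compactly. Assuming a reaction is exactly the sum of its component reactions, as in the Scheme 1 decomposition, and using my own notation in which ε is the DFT error relative to the reference (here CBS-QB3) enthalpy:

```latex
\Delta H_{1} = \sum_{i} \Delta H_{i}
\quad\Longrightarrow\quad
\varepsilon_{1} = \sum_{i} \varepsilon_{i},
\qquad
\varepsilon_{i} = \Delta H_{i}^{\mathrm{DFT}} - \Delta H_{i}^{\mathrm{ref}}
```

Component errors of opposite sign can therefore add up to a small ε1 even when the individual |εi| are several kcal mol-1, which is what makes the good overall performance fortuitous rather than reliable.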


(1) Wheeler, S. E.; Moran, A.; Pieniazek, S. N.; Houk, K. N., "Accurate Reaction Enthalpies and Sources of Error in DFT Thermochemistry for Aldol, Mannich, and α-Aminoxylation Reactions," J. Phys. Chem. A 2009, 113, 10376-10384, DOI: 10.1021/jp9058565

(2) Korth, M.; Grimme, S., "“Mindless” DFT Benchmarking," J. Chem. Theory Comput. 2009, 5, 993–1003, DOI: 10.1021/ct800511q

aldol & DFT & Houk & Mannich Steven Bachrach 01 Mar 2010 1 Comment

Benchmarking DFT for alkane conformers

Another benchmark study of the performance of different functionals – this time looking at the conformations of small alkanes.1 Martin first establishes high level benchmarks: the difference between the trans and gauche conformers of butane: CCSD(T)/cc-pVQZ, 0.606 kcal mol-1 and W1h-val, 0.611 kcal mol-1; and the energy differences of the conformers of pentane, especially the TT and TG gap: 0.586 kcal mol-1 at CCSD(T)/cc-pVTZ and 0.614 kcal mol-1 at W1h-val.
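To put a gap of about 0.6 kcal mol-1 in perspective, a Boltzmann estimate converts it into conformer populations. The sketch below is my own illustration: it treats the energy difference as a free-energy difference, assigns gauche butane a degeneracy of 2 (the two mirror-image forms), and uses 298.15 K; the function name is hypothetical.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal mol^-1 K^-1


def boltzmann_populations(rel_energies, degeneracies, temperature=298.15):
    """Fractional populations from relative energies (kcal/mol) and degeneracies."""
    weights = [
        g * math.exp(-e / (R_KCAL * temperature))
        for e, g in zip(rel_energies, degeneracies)
    ]
    total = sum(weights)
    return [w / total for w in weights]


# Butane: trans at 0.0, gauche at 0.606 kcal/mol (the CCSD(T)/cc-pVQZ gap above).
trans_frac, gauche_frac = boltzmann_populations([0.0, 0.606], [1, 2])
```

Under these assumptions about 58% of butane is trans at room temperature, so an error of even a few tenths of a kcal mol-1 in the gap shifts the predicted populations noticeably.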

They then examine the relative conformational energies of butane, pentane, hexane and a number of branched alkanes with a slew of functionals, covering the second through fifth rung of Perdew’s Jacob’s ladder. The paper has a whole lot of data – and the supporting materials include Jmol-enhanced visualization of the structures! – but the bottom line is the following. The traditionally used functionals (B3LYP, PBE, etc.) overestimate conformer energies while the M06 family underestimates the interaction energies that occur in GG-type conformers. A dispersion correction tends to overcorrect and leads to wrong energy ordering of conformers. But the new double-hybrid functionals (B2GP-PLYP and B2K-PLYP) with the dispersion correction provide quite nice agreement with the CCSD(T) benchmarks.

Also worrisome is that all the functionals have issues in geometry prediction, particularly in the backbone dihedral angles. So, for example, B3LYP misses the τ1 dihedral angle in the GG conformer by 5° and even M06-2X misses the τ2 angle in the TG conformer by 2.4°.


(1) Gruzman, D.; Karton, A.; Martin, J. M. L., "Performance of Ab Initio and Density Functional Methods for Conformational Equilibria of CnH2n+2 Alkane Isomers (n = 4-8)," J. Phys. Chem. A 2009, 113, 11974–11983, DOI: 10.1021/jp903640h.

DFT Steven Bachrach 06 Nov 2009 2 Comments

TD-DFT benchmark study

Here’s another extensive benchmarking study – this time on the use of TD-DFT to predict excitation energies.1 This study looks at the performance of 28 different functionals, and compares the TD-DFT excitation energies against a data set of (a) computed vertical energies and (b) experimental energies. The performance is generally about the same for both data sets, with many functionals (especially the hybrid functionals) giving errors of about 0.3 eV. Performance can be a bit better when examining subclasses of compounds. For example, PBE0 and mPW1PW91 have a mean unsigned error of only 0.14 eV for a set of organic dyes.


(1) Jacquemin, D.; Wathelet, V.; Perpete, E. A.; Adamo, C., "Extensive TD-DFT Benchmark: Singlet-Excited States of Organic Molecules," J. Chem. Theory Comput., 2009, 5, 2420-2435, DOI: 10.1021/ct900298e

DFT Steven Bachrach 28 Oct 2009 No Comments
