Archive for the 'DFT' Category

Assigning a computed NMR spectrum – the case of one diastereomer

What procedure should one employ when trying to determine a chemical structure from an NMR spectrum? I have discussed a number of such examples in the past, most recently the procedure by Goodman for dealing with the situation where one has the experimental spectra of two diastereomers and is trying to identify the structures of this pair.1 Now, Goodman provides an extension for the situation where one has a single experimental NMR spectrum and is trying to determine which of a number of diastereomeric structures best accounts for this spectrum.2 Not only does this prescription provide a means for identifying the best structure, it also provides a confidence level.

The method, called DP4, works as follows. First, perform an MM conformational search of every diastereomer. Select the conformations within 10 kJ mol-1 of the global minimum and compute the 13C and 1H NMR chemical shifts at B3LYP/6-31G(d,p) – note no reoptimizations! Then compute the Boltzmann-weighted average chemical shift. Scale these shifts against the experimental values. You’re now ready to apply the DP4 method. Compute the error in each chemical shift. Determine the probability of this error using the Student’s t distribution (with mean, standard deviation, and degrees of freedom as found using their database of over 1700 13C and over 1700 1H chemical shifts). Lastly, the DP4 probability of a given diastereomer is computed as the product of these probabilities divided by the sum of the corresponding products over all possible diastereomers. This process is not particularly difficult and Goodman provides a Java applet to perform the DP4 computation for you!
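To make the arithmetic of the last few steps concrete, here is a minimal Python sketch of the scaling and probability steps, written from the description above rather than taken from Goodman's applet. The function names are hypothetical, the t-distribution tail is obtained by simple numerical integration, and the `sigma` and `nu` values in the usage lines are placeholders: the fitted parameters should be taken from the paper.

```python
import math


def t_pdf(x, nu):
    """Density of Student's t distribution with nu degrees of freedom."""
    log_norm = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                - 0.5 * math.log(nu * math.pi))
    return math.exp(log_norm - (nu + 1) / 2 * math.log1p(x * x / nu))


def t_upper_tail(x, nu, steps=2000):
    """P(T > x) for x >= 0, via Simpson's rule on the density from 0 to x."""
    if x == 0:
        return 0.5
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, nu) + t_pdf(x, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 0.5 - total * h / 3


def scale_shifts(calc, expt):
    """Linearly scale computed shifts against the experimental ones."""
    n = len(calc)
    mean_c, mean_e = sum(calc) / n, sum(expt) / n
    slope = (sum((e - mean_e) * (c - mean_c) for c, e in zip(calc, expt))
             / sum((e - mean_e) ** 2 for e in expt))
    intercept = mean_c - slope * mean_e
    return [(c - intercept) / slope for c in calc]


def dp4_probabilities(candidates, expt, sigma, nu):
    """Product of per-shift tail probabilities, normalised over candidates."""
    products = []
    for calc in candidates:
        scaled = scale_shifts(calc, expt)
        p = 1.0
        for s, e in zip(scaled, expt):
            p *= t_upper_tail(abs(s - e) / sigma, nu)
        products.append(p)
    total = sum(products)
    return [p / total for p in products]


# Hypothetical 13C data: candidate A differs from experiment only by a
# systematic (linear) error, candidate B has scattered errors.
expt = [20.0, 35.0, 70.0, 128.0]
cand_a = [1.02 * e + 1.0 for e in expt]
cand_b = [23.0, 33.0, 74.0, 126.0]
probs = dp4_probabilities([cand_a, cand_b], expt, sigma=2.3, nu=11.0)  # placeholder parameters
print(probs)
```

In this toy case the scaling removes the systematic error of candidate A entirely, so it receives the larger share of the probability. The input shifts are assumed to be already Boltzmann-averaged over conformers.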

In the paper Smith and Goodman demonstrate that in identifying structures for a broad range of natural products, the DP4 method does an outstanding job at identifying the correct diastereomer, and an even better job of not assigning a wrong structure to the spectrum. Performance is markedly better than the typical procedures used, like using the correlation coefficient or mean absolute error. I would strongly encourage those people utilizing computed NMR spectra for identifying chemical structures to consider employing the DP4 method – the computational method is not particularly computer-intensive and the quality of the results is truly impressive.

Afternote: David Bradley has a nice post on this paper, including some comments from Goodman.

References

(1) Smith, S. G.; Goodman, J. M., "Assigning the Stereochemistry of Pairs of Diastereoisomers Using GIAO NMR Shift Calculation," J. Org. Chem., 2009, 74, 4597-4607, DOI: 10.1021/jo900408d

(2) Smith, S. G.; Goodman, J. M., "Assigning Stereochemistry to Single Diastereoisomers by GIAO NMR Calculation: The DP4 Probability," J. Am. Chem. Soc., 2010, 132, 12946-12959, DOI: 10.1021/ja105035r

DFT &NMR Steven Bachrach 28 Sep 2010 6 Comments

Origin of DFT failures – part II

Here’s one more attempt to discern the failure of DFT to handle simple alkanes (see this earlier post for a previous attempt to answer this question). Tsuneda and co-workers1 have applied long-range corrected (LC) DFT to the problem of the energy associated with “protobranching”, i.e., the energy of the reaction

CH3(CH2)nCH3 + n CH4 → (n+1) CH3CH3

They computed the energy of this reaction for the normal alkanes propane through decane using a variety of functionals, and compared these computed values with experimentally-derived energies. Table 1 gives the mean unsigned error for a few of the functionals. The prefix “LC” indicates inclusion of long-range corrections, “LCgau” indicates the LC scheme with a Gaussian attenuation, and “LRD” indicates inclusion of long-range dispersion.
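As a reminder of how such numbers are assembled, the following Python sketch evaluates the protobranching reaction energy for one alkane and the mean unsigned error over a series. The function names are hypothetical and the energies in the usage lines are made-up values chosen only to show the bookkeeping. They are not data from the paper.

```python
def protobranching_energy(n, e_alkane, e_methane, e_ethane):
    """Energy of CH3(CH2)nCH3 + n CH4 -> (n+1) CH3CH3 (products minus reactants)."""
    return (n + 1) * e_ethane - (e_alkane + n * e_methane)


def mean_unsigned_error(computed, reference):
    """Average absolute deviation between computed and reference values."""
    return sum(abs(c - r) for c, r in zip(computed, reference)) / len(computed)


# Hypothetical energies (kcal mol-1) for the n = 1 case (propane).
e_rxn = protobranching_energy(1, e_alkane=-10.0, e_methane=-4.0, e_ethane=-6.0)
print(e_rxn)

# Hypothetical computed vs. experimental reaction energies for two alkanes.
print(mean_unsigned_error([1.0, -2.0], [0.5, -1.0]))
```

With n running from 1 (propane) to 8 (decane), the same two functions would reproduce the kind of per-functional summary reported in Table 1.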

Table 1. Mean unsigned errors (MUE) of the “protobranching” reaction energy for various functionals compared to experiment.

Functional          MUE (kcal mol-1)
LC2gau-BLYP+LRD     0.09
LC-PBE+LRD          0.17
SVWN5               0.27
LCgau-PBE           1.56
M06                 1.98
LC-PBE              2.24
M06-2x              3.40
B3LYP               5.97
HF                  6.96

A number of important conclusions can be drawn. First, with both LC and LRD very nice agreement with experiment can be had. If only LC is included, the error increases on average by over 1 kcal mol-1. The M06-2X functional, touted as a fix of the problem, does not provide complete correction, though it is vastly superior to B3LYP and other hybrid functionals. The authors conclude that the need for LC incorporation points out that the exchange functional lacks the ability to account for this effect. Medium-range correlation is not the main source of the problem, as large discrepancies in the reaction energy error occur when different functionals are used that are corrected for LC and LRD. Choice of functional still matters, but the absence of long-range correction appears to be a main culprit, and further studies of its addition to standard functionals would be most helpful.

References

(1) Song, J.-W.; Tsuneda, T.; Sato, T.; Hirao, K., "Calculations of Alkane Energies Using Long-Range Corrected DFT Combined with Intramolecular van der Waals Correlation," Org. Lett. 2010, 12, 1440–1443, DOI: 10.1021/ol100082z

DFT Steven Bachrach 25 May 2010 6 Comments

Benchmarking DFT for the aldol and Mannich Reactions

Houk has performed a very nice examination of the performance of some density functionals.1 He takes a quite different approach from that proposed by Grimme – the “mindless” benchmarking2 using random molecules (see this post). Rather, Houk examined a series of simple aldol, Mannich and α-aminoxylation reactions, comparing their reaction energies predicted with DFT against those predicted with CBS-QB3. The idea here is to benchmark DFT performance for simple reactions of specific interest to organic chemists. These reactions are of notable current interest due to their involvement in organocatalytic enantioselective chemistry (see my posts on the aldol, Mannich, and Hajos-Parrish-Eder-Sauer-Wiechert reaction). Examples of the reactions studied (along with their enthalpies at CBS-QB3) are shown in Reactions 1-3.

Reaction 1

Reaction 2

Reaction 3

For the four simple aldol reactions and four simple Mannich reactions, PBE1PBE, mPW1PW91 and M06-2X all provided reaction enthalpies with errors of about 2 kcal mol-1. The much maligned B3LYP functional, along with B3PW91 and B1B95, gave energies with significantly larger errors. For the three α-aminoxylation reactions, the errors were smaller with B3PW91 and B1B95 than with PBE1PBE or M06-2X. Once again, it appears that one is faced with finding the right functional for the reaction under consideration!

Of particular interest is the decomposition of these reactions into related isogyric, isodesmic and homodesmic reactions. So for example Reaction 1 can be decomposed into Reactions 4-7 as shown in Scheme 1. (The careful reader might note that these decomposition reactions are isodesmic and homodesmotic and hyperhomodesmotic reactions.) The errors for Reactions 4-7 are typically greater than 4 kcal mol-1 using B3LYP or B3PW91, and even with M06-2X the errors are about 2 kcal mol-1.

Scheme 1.

Houk also points out that Reactions 4, 8 and 9 (Scheme 2) focus on having similar bond changes as in Reactions 1-3. And it’s here that the results are most disappointing. The errors produced by all of the functionals for Reactions 4, 8 and 9 are typically greater than 2 kcal mol-1, and even M06-2X can be in error by as much as 5 kcal mol-1. It appears that the reasonable performance of the density functionals for the “real world” aldol and Mannich reactions relies on fortuitous cancellation of errors in the underlying reactions. Houk calls for the development of new functionals designed to deal with fundamental simple bond changing reactions, like the ones in Scheme 2.

Scheme 2

References

(1) Wheeler, S. E.; Moran, A.; Pieniazek, S. N.; Houk, K. N., "Accurate Reaction Enthalpies and Sources of Error in DFT Thermochemistry for Aldol, Mannich, and α-Aminoxylation Reactions," J. Phys. Chem. A 2009, 113, 10376-10384, DOI: 10.1021/jp9058565

(2) Korth, M.; Grimme, S., ""Mindless" DFT Benchmarking," J. Chem. Theory Comput. 2009, 5, 993–1003, DOI: 10.1021/ct800511q

aldol &DFT &Houk &Mannich Steven Bachrach 01 Mar 2010 1 Comment

Predicting aqueous pKa

Predicting the pKa of a compound from first principles remains a challenge, despite all of the many algorithmic and methodological advances within the sphere of computational chemistry. Predicting the gas-phase deprotonation energy is relatively straightforward, as I detail in Section 2.2 of my book. The difficulty is in treating the solvent and the interaction of the acid and its conjugate base in solution. Considering that we are most interested in acidities in water, a very polar solvent, the interactions between water and the conjugate base and the proton are likely to be large and important!

Baker and Pulay report a procedure for determining acidities with the aim of high throughput.1 Thus, computational efficiency is a primary goal. Their approach is to compute the enthalpy change for deprotonation in solution using a continuum treatment and then employ a linear fit to predict the pKa with the equation:

pKa(c) = αcΔH + βc

where c designates a class of compound, such as alcohol, carboxylic acid, amine, etc. Fitting constants αc and βc need to be found then for each unique class of compound, where the fitting is to experimental pKas in water. In their test suite, they employed eleven anilines and amines, seven pyridines, nine alcohols and phenols, and seven carboxylic acids.
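The per-class fit amounts to an ordinary least-squares line. A minimal Python sketch follows, assuming one already has computed solution-phase deprotonation enthalpies and experimental pKa values for the members of a class. The function names and every number in the usage lines are hypothetical and are not taken from the paper.

```python
def fit_class(delta_h, pka_expt):
    """Least-squares fit of pKa = alpha * dH + beta for one compound class."""
    n = len(delta_h)
    mean_x = sum(delta_h) / n
    mean_y = sum(pka_expt) / n
    sxx = sum((x - mean_x) ** 2 for x in delta_h)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(delta_h, pka_expt))
    alpha = sxy / sxx
    beta = mean_y - alpha * mean_x
    return alpha, beta


def predict_pka(delta_h, alpha, beta):
    """Apply the fitted line to a new member of the same class."""
    return alpha * delta_h + beta


# Hypothetical training data for one class: enthalpies in kcal mol-1, pKa in water.
training = {
    "carboxylic acid": ([280.0, 290.0, 300.0], [4.0, 4.5, 5.0]),
}

parameters = {cls: fit_class(dh, pka) for cls, (dh, pka) in training.items()}
alpha, beta = parameters["carboxylic acid"]
print(predict_pka(285.0, alpha, beta))
```

Keeping one (alpha, beta) pair per class in a dictionary mirrors the structure of the equation: a compound from a class with no entry simply cannot be predicted until that class has been parameterized.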

They test a number of different computational variants: (a) what functional to employ, (b) what basis set to use for optimizing structures, and (c) what basis set to use for the enthalpy computation. They opt to employ COSMO for treating the solvent and quickly reject the use of gas-phase structures (and particularly the use of geometries obtained with molecular mechanics). Their ultimate model is OLYP/6-311+G**//3-21G(d) with the COSMO solvation model. Mean deviation is less than 0.4 pK units. They do note that use of HF or PW91 provides similarly small errors, but ultimately favor OLYP for its computational performance.

While this procedure offers some guidance for future computation of acidity, I find a couple of issues. First, it relies on fitted parameters for every class of compound. If one is interested in a new class, then one must develop the appropriate parameters – and the experimental values may not be available, or too few of them may be available. Second, the parameters cover up a great many computational sins, like the solvation energy of the proton, small basis sets, missing correlation energies, missing dispersion corrections, etc. A purist might hope for a computational algorithm that allows for systematic correction and improvement in the estimation of pKas. Further work needs to be done to meet this higher goal.

References

(1) Zhang, S.; Baker, J.; Pulay, P., "A Reliable and Efficient First Principles-Based Method for Predicting pKa Values. 1. Methodology," J. Phys. Chem. A 2010, 114, 425-431, DOI: 10.1021/jp9067069

Acidity &DFT Steven Bachrach 08 Feb 2010 No Comments

Benchmarking DFT for alkane conformers

Another benchmark study of the performance of different functionals – this time looking at the conformations of small alkanes.1 Martin first establishes high level benchmarks: the difference between the trans and gauche conformers of butane: CCSD(T)/cc-pVQZ, 0.606 kcal mol-1 and W1h-val, 0.611 kcal mol-1; and the energy differences of the conformers of pentane, especially the TT and TG gap: 0.586 kcal mol-1 at CCSD(T)/cc-pVTZ and 0.614 kcal mol-1 at W1h-val.

They then examine the relative conformational energies of butane, pentane, hexane and a number of branched alkanes with a slew of functionals, covering the second through fifth rung of Perdew’s Jacob’s ladder. The paper has a whole lot of data – and the supporting materials include Jmol-enhanced visualization of the structures! – but the bottom line is the following. The traditionally used functionals (B3LYP, PBE, etc.) overestimate conformer energies while the M06 family underestimates the interaction energies that occur in GG-type conformers. A dispersion correction tends to overcorrect and leads to wrong energy ordering of conformers. But the new double-hybrid functionals (B2GP-PLYP and B2K-PLYP) with the dispersion correction provide quite nice agreement with the CCSD(T) benchmarks.

Also worrisome is that all the functionals have issues in geometry prediction, particularly in the backbone dihedral angles. So, for example, B3LYP misses the τ1 dihedral angle in the GG conformer by 5° and even M06-2X misses the τ2 angle in the TG conformer by 2.4°.

References

(1) Gruzman, D.; Karton, A.; Martin, J. M. L., "Performance of Ab Initio and Density Functional Methods for Conformational Equilibria of CnH2n+2 Alkane Isomers (n = 4-8)," J. Phys. Chem. A 2009, 113, 11974–11983, DOI: 10.1021/jp903640h

DFT Steven Bachrach 06 Nov 2009 2 Comments

TD-DFT benchmark study

Here’s another extensive benchmarking study – this time on the use of TD-DFT to predict excitation energies.1 This study looks at the performance of 28 different functionals, and compares the TD-DFT excitation energies against a data set of (a) computed vertical energies and (b) experimental energies. The performance is generally about the same for both data sets, with many functionals (especially the hybrid functionals) giving errors of about 0.3 eV. Performance can be a bit better when examining subclasses of compounds. For example, PBE0 and mPW1PW91 have a mean unsigned error of only 0.14 eV for a set of organic dyes.

References

(1) Jacquemin, D.; Wathelet, V.; Perpete, E. A.; Adamo, C., "Extensive TD-DFT Benchmark: Singlet-Excited States of Organic Molecules," J. Chem. Theory Comput., 2009, 5, 2420-2435, DOI: 10.1021/ct900298e

DFT Steven Bachrach 28 Oct 2009 No Comments

CD of high-symmetry molecules

I have written a number of blog posts that deal with the computation of optical activity. Trindle and Altun have now reported TD-DFT computations of circular dichroism of high-symmetry molecules.1 They employ either B3LYP (with a variety of basis sets, the largest being 6-311++G(2d,2p)) or SAOP/ATZP. For a number of the high-symmetry molecules (two examples are shown in Figure 1), the two methods differ a bit in their predictions of the first excited state, with SAOP typically predicting a red shift relative to B3LYP. However, both methods generally give the same sign of the CD signals and their line shapes are similar.


1


2

Figure 1. B3LYP/6-31G(d) optimized structures of 1 and 2 (again due to incomplete supporting materials, I reoptimized these structures)

References

(1) Trindle, C.; Altun, Z., "Circular dichroism of some high-symmetry chiral molecules: B3LYP and SAOP calculations," Theor. Chem. Acc. 2009, 122, 145-155, DOI: 10.1007/s00214-008-0494-8.

InChIs

1: InChI=1/C18H14O2/c19-15-7-11-3-1-4-12-8-16(20)10-14-6-2-5-13(9-15)18(14)17(11)12/h1-6H,7-10H2
InChIKey=DYZSIUYFWKNLHS-UHFFFAOYAB

2: InChI=1/C20H24/c1-13-9-18-7-8-20-12-15(3)19(11-16(20)4)6-5-17(13)10-14(18)2/h9-12H,5-8H2,1-4H3
InChIKey=JTMLLDPOLFRPGJ-UHFFFAOYAC

DFT &Optical Rotation Steven Bachrach 27 Jul 2009 No Comments

Si-PETN sensitivity explained

PETN, C(CH2ONO2)4, is a relatively insensitive explosive. The silicon analogue, Si(CH2ONO2)4, is extraordinarily sensitive, exploding at the touch of a spatula. (By the way, this makes it extremely ill-advised as an explosive – it’s way too dangerous!) Goddard employed M06 computations to explore five different possible decomposition pathways, shown in Scheme 1.1 Reaction 1, the loss of NO2, is a standard decomposition pathway for many explosives, but the barriers for the C and Si analogues are similar and the reaction of the Si compound is not exothermic. The barrier for Reaction 2 is very large, and the barriers for the C and Si analogues for Reactions 3 and 4 are too similar to explain the differences in their sensitivities.

Scheme 1.

Reaction 5, however, does offer an explanation. The barrier for the Si analogue is 32 kcal mol-1, lower than for any other pathway, and almost 50 kcal mol-1 lower than the barrier for the rearrangement of PETN itself. Furthermore, Reaction 5 is very exothermic for Si-PETN (-44.5 kcal mol-1), while the most favorable pathway for PETN decomposition, Reaction 1, is endothermic. Thus the small barrier and the large amount of energy released for Reaction 5 of Si-PETN account for its extreme sensitivity.

References

(1) Liu, W.-G.; Zybin, S. V.; Dasgupta, S.; Klapötke, T. M.; Goddard III, W. A., "Explanation of the Colossal Detonation Sensitivity of Silicon Pentaerythritol Tetranitrate (Si-PETN) Explosive," J. Am. Chem. Soc. 2009, 131, 7490-7491, DOI: 10.1021/ja809725p.

InChIs

PETN: InChI=1/C5H8N4O12/c10-6(11)18-1-5(2-19-7(12)13,3-20-8(14)15)4-21-9(16)17/h1-4H2
InChIKey=TZRXHJWUDPFEEY-UHFFFAOYAE

Si-PETN: InChI=1/C4H8N4O12Si/c9-5(10)17-1-21(2-18-6(11)12,3-19-7(13)14)4-20-8(15)16/h1-4H2
InChIKey=FBKTZZKPJKPXMT-UHFFFAOYAL

DFT Steven Bachrach 20 Jul 2009 1 Comment

Computing 1H NMR chemical shifts

Computed NMR spectra have been a major theme of the blog (see these posts). General consensus is that they can be enormously helpful in characterizing structures and stereochemistry, but there has been a nagging sense that one needs to use very large basis sets to get reasonable accuracies.

Bally and Rablen1 now confront that claim and suggest instead that quite modest basis sets along with a number of flavors of DFT can provide very good 1H NMR shifts. They examined 80 organic molecules spanning a variety of functional groups. A key feature is that these molecules exist as a single conformation or their conformational distribution is dominated by one conformer. This avoids the need to compute a large number of conformers and take a Boltzmann average of their shifts – a task that would likely require a much larger basis set than what they hope to get away with.
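For molecules that do populate several conformers, the Boltzmann average mentioned above is a short calculation once the relative energies and per-conformer shifts are in hand. The Python sketch below is only an illustration: the function names are hypothetical, energies are assumed to be relative energies in kcal mol-1, and the numbers in the usage lines are invented.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal mol-1 K-1


def boltzmann_weights(rel_energies, temperature=298.15):
    """Normalised Boltzmann populations from relative energies (kcal mol-1)."""
    e_min = min(rel_energies)
    factors = [math.exp(-(e - e_min) / (R_KCAL * temperature))
               for e in rel_energies]
    total = sum(factors)
    return [f / total for f in factors]


def averaged_shifts(rel_energies, shifts_per_conformer, temperature=298.15):
    """Population-weighted average shift for each nucleus."""
    weights = boltzmann_weights(rel_energies, temperature)
    n_nuclei = len(shifts_per_conformer[0])
    return [sum(w * shifts[k] for w, shifts in zip(weights, shifts_per_conformer))
            for k in range(n_nuclei)]


# Two hypothetical conformers, 0.6 kcal mol-1 apart, with two protons each.
energies = [0.0, 0.6]
shifts = [[1.20, 3.40], [1.35, 3.10]]
print(boltzmann_weights(energies))
print(averaged_shifts(energies, shifts))
```

Because the weights depend exponentially on the energy differences, small errors in the conformer energies shift the averaged spectrum noticeably, which is one reason single-conformer test molecules make for a cleaner benchmark.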

The most important conclusion: the WP04 functional,2 developed by Cramer to predict proton spectra, with the very small 6-31G(d,p) basis set and incorporation of the solvent through PCM provides excellent cost/benefit performance. The rms error of the proton chemical shifts is 0.198 ppm, and this can be reduced to 0.140 ppm with scaling. The 6-31G(d) basis set is even better if one uses a linear scaling; its error is only 0.120 ppm. B3LYP/6-31G(d,p) has an rms error only somewhat worse. Use of aug-cc-pVTZ basis sets, while substantially more time consuming, provides inferior predictions.

The authors contend that this sort of simple DFT computation, affordable for many organic systems on standard desktop PCs, should be routinely done, especially in preference to increment schemes that are components of some drawing programs. And if a synthesis group does not have the tools to do this sort of work, I’m sure there are many computational chemists that would be happy to collaborate!

References

(1) Jain, R.; Bally, T.; Rablen, P. R., "Calculating Accurate Proton Chemical Shifts of Organic Molecules with Density Functional Methods and Modest Basis Sets," J. Org. Chem. 2009, DOI: 10.1021/jo900482q.

(2) Wiitala, K. W.; Hoye, T. R.; Cramer, C. J., "Hybrid Density Functional Methods Empirically Optimized for the Computation of 13C and 1H Chemical Shifts in Chloroform Solution," J. Chem. Theory Comput. 2006, 2, 1085-1092, DOI: 10.1021/ct6001016

DFT &NMR Steven Bachrach 15 Jun 2009 3 Comments

More DFT benchmarks – sugars and “mindless” test sets

Another two benchmarking studies of the performance of DFT have appeared.

The first is an examination by Csonka and French of the ability of DFT to predict the relative energies of carbohydrate conformers.1 They examined 15 conformers of α- and β-D-allopyranose, 15 conformers of 3,6-anhydro-4-O-methyl-D-galactitol and four conformers of β-D-glucopyranose. The energies were referenced against those obtained at MP2/a-cc-pVTZ(-f)//B3LYP/6-31+G*. (This unusual basis set lacks the f functions on heavy atoms and d and diffuse functions on H.) Among the many comparisons and conclusions are the following: B3LYP is not the best functional for the sugars; in fact all other tested hybrid functionals did better, with M05-2X giving the best results. They suggest that M05-2X/6-311+G**//M05-2X/6-31+G* is the preferred model for sugars, except for evaluating the difference between 1C4 and 4C1 conformers, where they opt for PBE/6-31+G**.

The second, by Korth and Grimme, describes a “mindless” DFT benchmarking study.2 This is really not a “mindless” study (as the term is used by Schaefer and Schleyer3 and discussed in this post, where all searching is done in a totally automated way) but rather Grimme describes a procedure for removing biases in the test set. Selection of “artificial molecules” is made by first deciding how many atoms are to be present and what will be the distribution of elements. In their two samples, they select systems having 8 atoms. The two sets differ by the distribution of the elements. In the first set, the atoms Na-Cl are one-third as probable as the elements Li-F, which are one-third as probable as H. The second set has a probability distribution similar to that found in naturally occurring organic compounds. The eight atoms, randomly selected by the computer, are placed in the corners of a cube and allowed to optimize (this is reminiscent of the “mindless” procedure of Schaefer and Schleyer3). This process generates a selection of random bonding environments along with open- and closed-shell species, and removes (to a large degree) the biases of previous test sets, which are often skewed towards small molecules, ones where accurate experiments are available, or geared towards a select group of molecules of interest. Energies are then computed using a variety of functionals and compared to the energy at CCSD(T)/CBS. The bottom line is that the functionals nicely group along the rungs defined by Perdew:4 LDA is the poorest performer, GGA does much better, the third rung of meta-GGA functionals are slightly better than GGA functionals, hybrids do better still, and the fifth rung functionals (double hybrids) perform quite well. Also of interest is that CCSD(T)/cc-pVDZ gives quite large errors and so Grimme cautions against using this small basis set.
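The element-selection step for the first test set can be sketched in a few lines of Python. This is my reading of the description above, not Grimme's code: the weights are applied per element (H : Li-F : Na-Cl = 9 : 3 : 1), the cube edge length is an arbitrary placeholder, and the function name is hypothetical. The subsequent geometry optimization is not shown.

```python
import itertools
import random

FIRST_ROW = ["Li", "Be", "B", "C", "N", "O", "F"]
SECOND_ROW = ["Na", "Mg", "Al", "Si", "P", "S", "Cl"]
ELEMENTS = ["H"] + FIRST_ROW + SECOND_ROW
# Assumed per-element weights: Na-Cl one-third as likely as Li-F,
# which are one-third as likely as H.
WEIGHTS = [9] + [3] * len(FIRST_ROW) + [1] * len(SECOND_ROW)


def random_cube_molecule(rng, edge=2.0):
    """Pick eight elements at random and place them on the corners of a cube."""
    corners = list(itertools.product((0.0, edge), repeat=3))
    atoms = rng.choices(ELEMENTS, weights=WEIGHTS, k=len(corners))
    return list(zip(atoms, corners))


rng = random.Random(2009)  # fixed seed so the example is repeatable
for symbol, (x, y, z) in random_cube_molecule(rng):
    print(f"{symbol:2s} {x:6.2f} {y:6.2f} {z:6.2f}")
```

Each call yields a starting geometry in a simple Cartesian form that could be handed to an optimizer. Swapping in a different weight list would give the second, "organic-like" distribution.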

References

(1) Csonka, G. I.; French, A. D.; Johnson, G. P.; Stortz, C. A., "Evaluation of Density Functionals and Basis Sets for Carbohydrates," J. Chem. Theory Comput. 2009, ASAP, DOI: 10.1021/ct8004479.

(2) Korth, M.; Grimme, S., ""Mindless" DFT Benchmarking," J. Chem. Theory Comput. 2009, ASAP, DOI: 10.1021/ct800511q.

(3) Bera, P. P.; Sattelmeyer, K. W.; Saunders, M.; Schaefer, H. F.; Schleyer, P. v. R., "Mindless Chemistry," J. Phys. Chem. A 2006, 110, 4287-4290, DOI: 10.1021/jp057107z.

(4) Perdew, J. P.; Ruzsinszky, A.; Tao, J.; Staroverov, V. N.; Scuseria, G. E.; Csonka, G. I., "Prescription for the design and selection of density functional approximations: More constraint satisfaction with fewer fits," J. Chem. Phys. 2005, 123, 062201-9, DOI: 10.1063/1.1904565

DFT &Grimme Steven Bachrach 21 Apr 2009 3 Comments
