What is it about?

Semiempirical quantum chemistry methods offer a compromise in cost and accuracy between density functional theory (DFT) and molecular mechanics (MM). The most popular semiempirical methods employ the neglect of diatomic differential overlap (NDDO) approximation, which is very crude and typically necessitates heavy parameterization, leading to over-parameterized, poorly interpretable, and poorly transferable models. In this work, we develop a new semiempirical method, dubbed NOTCH (Natural Orbital Tied Constructed Hamiltonian), which goes beyond the NDDO approximation and features a greatly reduced degree of empiricism: all but 8 of its parameters are derived from theory.


Why is it important?

Contrary to the common strategy of improving the accuracy of semiempirical methods by adding more free parameters (sometimes even machine learning models), NOTCH describes many physical effects with separate, specially tailored correction terms, many of which have not been used in existing semiempirical methods. In particular, separate terms with physically justified forms account for the radial and angular incompleteness of the minimal basis set, as well as for the non-dynamic, local dynamic, and dispersive components of the correlation energy. This stands in stark contrast to the usual approach of trying to absorb errors into terms that were not designed to describe the effect in question (e.g. absorbing non-dynamic correlation into dynamic-correlation corrections), with the attendant risk of overfitting. Our preliminary results suggest that NOTCH is more robust, and frequently more accurate, than existing semiempirical methods, while being at most 3–5 times more expensive. Our main message is that there is still room for improving semiempirical methods with a physically motivated (rather than data-driven) approach.

Perspectives

Contrary to the philosophy outlined in the "Why is it important?" section, during the first year of the project I designed an increasingly complicated theory, throwing more and more terms and parameters into it with the sole aim of increasing the variational freedom of the method, so that I could obtain a lower fitting error against a training set (which included a few subsets of the GMTKN55 benchmark set). Eventually the theory grew into a 19-page monster with 166 parameters. The terms generally respected exact asymptotic conditions (e.g. the behavior of integrals at infinite internuclear distance R), but did not have physically motivated forms in regions where no exact condition is available. The fitted parameters behaved almost randomly as a function of the element type, suggesting that they were not physically justified but merely served as mathematical devices that absorbed the fitting error. Although the model was still relatively transferable, and I thought it was already less empirical than existing semiempirical methods, we (especially Frank) were not satisfied. What Frank said, and I still remember it today, was that the average reviewer would get bored by the fifth page of the theory.

Then, five months before the paper was submitted, we painstakingly redesigned the whole theory. This time, instead of optimizing only for the training loss, we primarily optimized for the number of pages needed to describe the theory (or, in computational theory jargon, the "Kolmogorov complexity" of our model). Terms that could be described in fewer sentences and formulas replaced those that needed more text, even if the former were harder to implement, costlier to calculate, or slightly less accurate. This strategy worked surprisingly well: although it is an extremely non-convex optimization problem requiring extensive human input, after 3 months it yielded a theory describable within 8 pages. Meanwhile, corrections that needed less code but more parameters were systematically replaced by those that needed more code but fewer parameters. For example, correction factors that were Gaussian functions with element- and shell-dependent exponents were replaced by a ratio of two complicated integrals involving the LDA correlation kernel (Eq. (29)), which automatically determines the correct rate of decay for each element in a physically explainable way, at the expense of mathematical complexity. This gave a model with only 8 empirical parameters and superior accuracy for the systems we tested (atoms and diatomic molecules).

I would particularly like to mention that the last piece of the puzzle, the left-right correlation (LRC) correction (Sec. II H), was a considerable pain to come up with: our previous functional forms were not designed to recover the LRC, and the parameters that could have fortuitously absorbed it had already been removed by our own hands. After weeks of fruitless effort, I arrived at the elegant idea of approximating the LRC as proportional to the 2-RDM equivalent of the Wiberg bond order, with the prefactor determined by studying homonuclear diatomic molecules (a rough sketch of this idea is given below). To the best of our knowledge, this is a new idea that may also prove useful in, e.g., the DFT community, as a way to capture non-dynamic correlation while remaining within a single-determinant picture.

Despite our preliminary success, our endeavor with the NOTCH method has only just begun; in particular, there is more than one possible way to apply NOTCH to polyatomic molecules, and we are still experimenting with them. We hope that the final version of NOTCH will inherit the desirable properties of this first version: physical motivation, minimal empiricism, and robustness in corner cases.
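To give a flavor of the bond-order idea, here is a minimal Python sketch (not the actual NOTCH working equations). It computes Mayer/Wiberg-type bond orders from a one-particle density matrix and scales their sum by a single prefactor; the function names, the use of the one-particle density matrix in place of the 2-RDM, and the single global prefactor are all illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

def mayer_bond_orders(P, S, ao_to_atom):
    """Mayer/Wiberg-type bond orders from a one-particle density matrix.

    P          : AO density matrix (n_ao x n_ao)
    S          : AO overlap matrix (n_ao x n_ao)
    ao_to_atom : integer array mapping each AO index to its atom index
    Returns a symmetric (n_atom x n_atom) matrix of bond orders.
    """
    PS = P @ S                               # reduces to Wiberg's formula when S = I
    n_atoms = int(np.max(ao_to_atom)) + 1
    B = np.zeros((n_atoms, n_atoms))
    for A in range(n_atoms):
        mu = np.where(ao_to_atom == A)[0]
        for C in range(A + 1, n_atoms):
            nu = np.where(ao_to_atom == C)[0]
            # sum over AO pairs on the two atoms: (PS)_{mu nu} (PS)_{nu mu}
            val = np.sum(PS[np.ix_(mu, nu)] * PS[np.ix_(nu, mu)].T)
            B[A, C] = B[C, A] = val
    return B

def lrc_energy(P, S, ao_to_atom, prefactor):
    """Toy left-right-correlation correction, proportional to the bond orders.

    'prefactor' stands in for a constant that would be calibrated against
    homonuclear diatomic molecules (hypothetical; see the paper for the
    actual 2-RDM-based formulation).
    """
    B = mayer_bond_orders(P, S, ao_to_atom)
    return -prefactor * np.sum(np.triu(B, k=1))
```

For a minimal-basis H2 density matrix this construction yields a bond order close to 1, so the toy correction scales roughly with bond multiplicity, which is the qualitative behavior one would want from a left-right correlation term tied to bond orders.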

Zikuan Wang
Max-Planck-Institut für Kohlenforschung

Read the Original

This page is a summary of: Development of NOTCH, an all-electron, beyond-NDDO semiempirical method: Application to diatomic molecules, The Journal of Chemical Physics, May 2023, American Institute of Physics. DOI: 10.1063/5.0141686.
