What is it about?

The paper concerns reducing the amount of training data required to build a machine learning model of the potential energy surface for a given intermolecular interaction. These machine-learned potentials (MLPs) often employ a cross-over distance, beyond which predictions switch from the machine learning method to a long-range approximation. Previously, this distance was fixed before training; here we show that learning it from the reference data can reduce the number of training points required to achieve a given accuracy.
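As a rough illustration of the cross-over idea, and not the procedure used in the paper, the sketch below switches between a short-range model and a long-range form at a distance r_cross, then picks r_cross from a list of candidates by minimising the squared error against reference energies. Everything named here is an assumption made for the example: the -C6/r^6 tail, the Lennard-Jones-like reference curve, the stand-in "drifting" short-range model (a placeholder for a trained GP), and the grid search itself.

```python
from typing import Callable, Iterable, Sequence, Tuple


def long_range(r: float, c6: float = 1.0) -> float:
    """Hypothetical long-range approximation: a -C6/r^6 dispersion tail."""
    return -c6 / r**6


def switched_energy(r: float, r_cross: float,
                    short_range: Callable[[float], float]) -> float:
    """Use the short-range model below r_cross and the long-range form from r_cross on."""
    return short_range(r) if r < r_cross else long_range(r)


def pick_cross_over(candidates: Iterable[float],
                    reference: Sequence[Tuple[float, float]],
                    short_range: Callable[[float], float]) -> float:
    """Return the candidate cross-over distance with the lowest squared error."""
    def squared_error(r_cross: float) -> float:
        return sum((switched_energy(r, r_cross, short_range) - e) ** 2
                   for r, e in reference)
    return min(candidates, key=squared_error)


def lj(r: float) -> float:
    """Stand-in reference curve (reduced Lennard-Jones-like form), for illustration only."""
    return 1.0 / r**12 - 1.0 / r**6


def drifting_model(r: float) -> float:
    """Stand-in for a trained short-range model that loses accuracy beyond r = 2."""
    return lj(r) + 0.01 * max(0.0, r - 2.0)


# Reference energies on a grid of separations from 1.0 to 4.0.
reference = [(0.25 * i, lj(0.25 * i)) for i in range(4, 17)]
best = pick_cross_over([1.5, 2.0, 2.5, 3.0], reference, drifting_model)
print("selected cross-over distance:", best)
```

In this toy setting the selected distance sits where the stand-in short-range model starts to lose accuracy and the long-range tail has become adequate, which is the trade-off that a learned cross-over distance is meant to capture.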

Why is it important?

The machine learning method employed here is a Gaussian process (GP). Though GPs have been found to produce more accurate MLPs than, for example, neural networks, the computational cost of both making predictions with a GP and re-optimising it scales poorly with the amount of training data. Consequently, to harness the predictive power of GPs in this context, reducing the size of the training sets is paramount. Moreover, a smaller training set can be upgraded to more accurate, more expensive reference data at a lower computational cost than a larger one. The resultant models have potential applications in molecular simulations, such as those used to infer conditions in a carbon capture pipeline.

Read the Original

This page is a summary of: Gaussian process models of potential energy surfaces with boundary optimization, The Journal of Chemical Physics, October 2021, American Institute of Physics, DOI: 10.1063/5.0063534.