Improved decision making with similarity based machine learning

Authors :
Lemm, Dominik
von Rudorff, Guido Falk
von Lilienfeld, O. Anatole
Publication Year :
2022

Abstract

Despite their fundamental importance for science and society at large, experimental design decisions are often plagued by extreme data scarcity, which severely hampers the use of modern ready-made machine learning models, as they rely heavily on the paradigm 'the bigger the data, the better'. Presenting similarity-based machine learning, we show how to reduce these data needs such that decision making can be objectively improved in certain problem classes. After introducing similarity machine learning for the harmonic oscillator and the Rosenbrock function, we describe real-world applications to very scarce data scenarios, which include quantum-mechanics-based molecular design and organic synthesis planning. Analysis of the query and training data proximity confirms that only a fraction of the data is necessary to converge to competitive performance. Finally, we derive a relationship between the intrinsic dimensionality and the volume of feature space, which governs the overall model accuracy.

Subjects

Subjects :
Physics - Chemical Physics

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2205.05633
Document Type :
Working Paper