The resolution-vs.-accuracy dilemma in machine learning modeling of electronic excitation spectra

Prakriti Kayastha; Sabyasachi Chakraborty; Raghunathan Ramakrishnan

doi:10.1039/D1DD00031D

Prakriti Kayastha,†^a Sabyasachi Chakraborty†^a and Raghunathan Ramakrishnan*^a

Author affiliations

* Corresponding authors

^a Tata Institute of Fundamental Research Hyderabad, Hyderabad 500046, India
E-mail: ramakrishnan@tifrh.res.in

Abstract

In this study, we explore the potential of machine learning for modeling molecular electronic spectral intensities as a continuous function in a given wavelength range. Since presently available chemical space datasets provide excitation energies and corresponding oscillator strengths for only a few valence transitions, we present a new dataset, bigQM7ω, with 12 880 molecules containing up to 7 CONF atoms (C, O, N, or F), and report ground state and excited state properties. A publicly accessible web-based data-mining platform is presented to facilitate on-the-fly screening of several molecular properties, including harmonic vibrational and electronic spectra. We present all singlet electronic transitions from the ground state calculated using the time-dependent density functional theory framework with the ωB97XD exchange-correlation functional and a diffuse-function augmented basis set. The resulting spectra predominantly span the X-ray to deep-UV region (10–120 nm). To compare the target spectra with predictions based on small basis sets, we bin spectral intensities and show that good agreement is obtained only at the expense of resolution. In contrast, machine learning models with the latest structural representations, trained directly on <10% of the target data, recover the spectra of the remaining molecules with better accuracies at a desirable <1 nm wavelength resolution.
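The binning step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a spectrum is given as a list of transition wavelengths (in nm) with matching oscillator strengths, and that intensities falling into the same wavelength bin are simply summed. The function name `bin_spectrum`, the toy data, and the bin widths are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def bin_spectrum(wavelengths_nm, osc_strengths, lo=10.0, hi=120.0, width=1.0):
    """Sum oscillator strengths into equal-width wavelength bins.

    Hypothetical helper: the 10-120 nm range follows the abstract,
    but the summation scheme is an assumption made for illustration.
    """
    n_bins = int(round((hi - lo) / width))
    edges = np.linspace(lo, hi, n_bins + 1)
    # np.histogram with weights accumulates the intensity per bin
    binned, _ = np.histogram(wavelengths_nm, bins=edges, weights=osc_strengths)
    return edges, binned

# Toy stick spectrum (made-up values, not taken from bigQM7ω)
wl = np.array([10.2, 10.7, 55.5, 119.9])
f = np.array([0.1, 0.2, 0.3, 0.4])

edges_fine, fine = bin_spectrum(wl, f, width=1.0)       # 1 nm resolution
edges_coarse, coarse = bin_spectrum(wl, f, width=10.0)  # 10 nm resolution
```

Wider bins merge neighbouring transitions, so two spectra that differ in fine detail can agree bin by bin at coarse resolution. This is one way to picture the trade-off between resolution and accuracy that the abstract describes.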

Article information

DOI: https://doi.org/10.1039/D1DD00031D
Article type: Paper
Submitted: 30 Oct 2021
Accepted: 18 Aug 2022
First published: 18 Aug 2022
This article is Open Access

Digital Discovery, 2022, 1, 689-702

P. Kayastha, S. Chakraborty and R. Ramakrishnan, Digital Discovery, 2022, 1, 689 DOI: 10.1039/D1DD00031D

This article is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported Licence. You can use material from this article in other publications, without requesting further permission from the RSC, provided that the correct acknowledgement is given and it is not used for commercial purposes.
