Issue 13, 2025

Adaptive representation of molecules and materials in Bayesian optimization

Abstract

Bayesian optimization (BO) is increasingly used in molecular optimization and in guiding self-driving laboratories for automated materials discovery. A crucial aspect of BO is how molecules and materials are represented as feature vectors, where both the completeness and compactness of these representations can influence the efficiency of the optimization process. Traditionally, a fixed representation is chosen by expert chemists or by applying data-driven feature selection methods to available labeled datasets. However, for novel optimization tasks, prior knowledge or large datasets are often unavailable, and relying on them can even introduce bias into the search process. In this work, we present a Feature Adaptive Bayesian Optimization (FABO) framework, which integrates feature selection into the Bayesian optimization process with Gaussian processes to dynamically adapt material representations throughout the optimization cycles. We demonstrate the effectiveness of this adaptive approach across several molecular optimization tasks, including the discovery of high-performing metal–organic frameworks (MOFs) in three distinct tasks, each involving unique property distributions and requiring a distinct representation. Our results show that adapting the representation outperforms both the random search baseline and scenarios in which prior knowledge of the feature space is available. Notably, for known optimization tasks, FABO automatically identifies representations that are aligned with human chemical intuition, validating its utility for optimization tasks where such insights are not available in advance. Lastly, we show how a suboptimal representation, e.g., one missing key features, can adversely impact BO performance, highlighting the importance of starting from a full feature set and adapting it to each task. Our findings highlight FABO as a robust approach for navigating large, complex materials search spaces in automated discovery campaigns.
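The idea of re-selecting features inside each optimization cycle can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the feature selection method, kernel, or acquisition function, so the absolute-Pearson-correlation ranking, the fixed-length-scale RBF kernel, the upper-confidence-bound acquisition, the synthetic pool, and all function and parameter names below (`select_features`, `gp_posterior`, `adaptive_bo`, `k`, `beta`) are assumptions chosen only to keep the example short and self-contained.

```python
import numpy as np

def select_features(X, y, k):
    """Keep the k features with the largest |Pearson correlation| with y.
    (Illustrative stand-in; it only detects roughly monotone trends.)"""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum()) + 1e-12
    scores = np.abs(Xc.T @ yc) / denom
    return np.sort(np.argsort(scores)[::-1][:k])

def rbf(A, B, length_scale=1.0):
    """Squared-exponential kernel with a fixed (assumed) length scale."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale ** 2)

def gp_posterior(X_train, y_train, X_test, noise=1e-3):
    """Gaussian process posterior mean and standard deviation at X_test."""
    y_mean, y_std = y_train.mean(), y_train.std() + 1e-12
    y_n = (y_train - y_mean) / y_std          # standardize the targets
    K = rbf(X_train, X_train) + noise * np.eye(len(X_train))
    Ks = rbf(X_train, X_test)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_n))
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.clip(1.0 - (v ** 2).sum(axis=0), 1e-12, None)
    return mu * y_std + y_mean, np.sqrt(var) * y_std

def adaptive_bo(X_pool, oracle, n_init=10, n_iter=30, k=3, beta=2.0, seed=0):
    """Pool-based BO that re-selects its feature subset in every cycle."""
    rng = np.random.default_rng(seed)
    labeled = [int(i) for i in rng.choice(len(X_pool), size=n_init, replace=False)]
    y = {i: oracle(i) for i in labeled}
    history = []
    for _ in range(n_iter):
        y_lab = np.array([y[i] for i in labeled])
        # 1) adapt the representation using only the data labeled so far
        feats = select_features(X_pool[labeled], y_lab, k)
        history.append(feats)
        # 2) fit the surrogate on the selected features and score the rest
        rest = np.array([i for i in range(len(X_pool)) if i not in y])
        mu, sd = gp_posterior(X_pool[labeled][:, feats], y_lab,
                              X_pool[rest][:, feats])
        # 3) query the candidate with the highest upper confidence bound
        nxt = int(rest[np.argmax(mu + beta * sd)])
        labeled.append(nxt)
        y[nxt] = oracle(nxt)
    return labeled, y, history

# Synthetic pool: only features 0 and 1 matter; the other four are noise.
rng = np.random.default_rng(1)
X_pool = rng.normal(size=(150, 6))
y_true = 3.0 * X_pool[:, 0] - 2.0 * X_pool[:, 1]
labeled, y_obs, history = adaptive_bo(X_pool, lambda i: float(y_true[i]))
print("best value found:", max(y_obs.values()))
print("features used in the last cycle:", history[-1])
```

Because the feature subset is recomputed from the labeled set at every cycle, the surrogate can start from the full feature pool and narrow its representation as evidence accumulates, which is the behavior the abstract describes at a high level.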

Article information

Article type: Edge Article
Submitted: 13 Nov 2024
Accepted: 18 Feb 2025
First published: 19 Feb 2025
This article is Open Access

All publication charges for this article have been paid for by the Royal Society of Chemistry
Creative Commons BY license

Chem. Sci., 2025, 16, 5464-5474

M. Rajabi-Kochi, N. Mahboubi, A. P. S. Gill and S. M. Moosavi, Chem. Sci., 2025, 16, 5464 DOI: 10.1039/D5SC00200A

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
