Equipping data-driven experiment planning for Self-driving Laboratories with semantic memory: case studies of transfer learning in chemical reaction optimization†
Abstract
Optimization strategies based on machine learning (ML), such as Bayesian optimization, show promise across the experimental sciences as a superior alternative to traditional design of experiments. Additional benefits are captured when these ML algorithms are combined with automated laboratory equipment using Atinary's orchestration software platform SDLabs (https://www.atinary.com). The synergy of these technologies is referred to as Self-driving Laboratories, which hold the potential to revolutionize scientific experimentation. Thus far, however, autonomous experimentation projects have not fully leveraged pre-existing knowledge, often beginning from scratch and sequentially collecting measurements from new experiments. This is in stark contrast to experimentation by humans, where trained domain experts rely on intuition to select initial parameter settings for a novel optimization campaign. In this work, we introduce Atinary's transfer learning algorithm SeMOpt, a general-purpose, model-agnostic Bayesian optimization framework that uses meta-/few-shot learning to efficiently transfer knowledge from related historical experiments to a novel experimental campaign via a compound acquisition function. We apply SeMOpt to chemical reaction optimization in two case studies: (i) the optimization of five simulated cross-coupling reactions, which demonstrates the ability of our approach to adapt to data with unknown effects; and (ii) the optimization of the palladium-catalyzed Buchwald–Hartwig cross-coupling of aryl halides with 4-methylaniline in the presence of potentially inhibitory additives. SeMOpt accelerates optimization by up to a factor of 10 compared to standard single-task ML optimizers (those without transfer learning capabilities). Moreover, SeMOpt outperforms several existing Bayesian optimization strategies that leverage historical data. Thus, we believe this work presents a valuable technical contribution to general-purpose optimization and strengthens the case for replacing the traditional trial-and-error experimentation process with Self-driving Laboratories augmented with semantic memory.