A focus on the use of real-world datasets for yield prediction
Abstract
The prediction of reaction yields remains a challenging task for machine learning (ML), given the vast search spaces and the absence of robust training data. Wiest, Chawla et al. (https://doi.org/10.1039/D2SC06041H) show that a deep learning algorithm performs well on high-throughput experimentation data but surprisingly poorly on real-world, historical data from a pharmaceutical company. This result suggests that there is considerable room for improvement when coupling ML to electronic laboratory notebook data.
- This article is part of the themed collection: Chemical Science Focus Articles, 2024