De novo generated combinatorial library design

Simon Viet Johansson; Morteza Haghir Chehreghani; Ola Engkvist; Alexander Schliep

doi:10.1039/D3DD00095H

Simon Viet Johansson,*^ab Morteza Haghir Chehreghani,^b Ola Engkvist^ab and Alexander Schliep^bc

Author affiliations

* Corresponding authors

^a Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
E-mail: simon.johansson@astrazeneca.com

^b Department of Computer Science and Engineering, University of Gothenburg, Chalmers University of Technology, Gothenburg, Sweden

^c Faculty of Health Sciences, Brandenburg University of Technology Cottbus-Senftenberg, Cottbus, Germany

Abstract

Artificial intelligence (AI) contributes new methods for designing compounds in drug discovery, ranging from de novo design models that suggest new molecular structures or optimize existing leads to predictive models that evaluate their toxicological properties. However, a limiting factor for the effectiveness of AI methods in drug discovery is the lack of access to high-quality data sets, leading to a focus on approaches that optimize data generation. Combinatorial library design is a popular approach for bioactivity testing because a large number of molecules can be synthesized from a limited number of building blocks. We propose a framework for designing combinatorial libraries that uses a molecular generative model to generate building blocks de novo, followed by k-determinantal point processes and Gibbs sampling to optimize a selection from the generated blocks. We explore the optimization of biological activity, Quantitative Estimate of Drug-likeness (QED), and diversity, as well as the trade-offs between them, in both single-objective and multi-objective library design settings. Using retrosynthesis models to estimate building block availability, the proposed framework is able to explore the prospective benefit of expanding a stock of available building blocks, by synthesis or by purchasing the preferred building blocks, before designing a library. In simulation experiments with building block collections from all available commercial vendors, near-optimal libraries could be found without synthesis of additional building blocks; in other simulation experiments, we showed that even one synthesis step to increase the number of available building blocks could improve library designs when starting with an in-house building block collection of reasonable size.
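To make the selection step more concrete, the following is a minimal sketch of how a fixed-size subset could be drawn from a k-determinantal point process with a swap-based Gibbs-style chain. It is a generic illustration and not the authors' implementation: the two-dimensional feature vectors, the RBF similarity, the quality scores, and all function names (`build_kernel`, `subset_det`, `gibbs_kdpp`) are assumptions made for the example. The kernel uses the common quality-times-similarity form, so subsets of high-quality, mutually dissimilar items receive a larger determinant and are sampled more often.

```python
import math
import random


def determinant(matrix):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # numerically singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def build_kernel(features, quality):
    """L[i][j] = q_i * sim(i, j) * q_j, with an RBF similarity (assumed choice)."""
    n = len(features)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dist2 = sum((x - y) ** 2 for x, y in zip(features[i], features[j]))
            kernel[i][j] = quality[i] * math.exp(-dist2) * quality[j]
    return kernel


def subset_det(kernel, subset):
    """Determinant of the kernel restricted to `subset`, clamped at zero."""
    idx = sorted(subset)
    return max(determinant([[kernel[i][j] for j in idx] for i in idx]), 0.0)


def gibbs_kdpp(kernel, k, steps=2000, seed=0):
    """Swap-based chain whose target is P(S) proportional to det(L_S), |S| = k."""
    n = len(kernel)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n")
    rng = random.Random(seed)
    current = set(rng.sample(range(n), k))
    current_det = subset_det(kernel, current)
    for _ in range(steps):
        # Propose exchanging one selected item for one unselected item.
        out_item = rng.choice(sorted(current))
        in_item = rng.choice(sorted(set(range(n)) - current))
        proposal = (current - {out_item}) | {in_item}
        proposal_det = subset_det(kernel, proposal)
        total = current_det + proposal_det
        # Accept with probability det(new) / (det(old) + det(new)).
        if total > 0 and rng.random() < proposal_det / total:
            current, current_det = proposal, proposal_det
    return sorted(current)


# Hypothetical toy data: two tight clusters plus one isolated point.
features = [(0.0, 0.0), (0.1, 0.0), (3.0, 3.0), (3.1, 3.0), (6.0, 0.0)]
quality = [1.0, 1.0, 1.0, 1.0, 1.0]
kernel = build_kernel(features, quality)
selected = gibbs_kdpp(kernel, k=3)
print(selected)
```

In this sketch the quality scores stand in for whatever objective is being optimized (for example predicted activity or QED), while the similarity term rewards diversity; raising the quality of an item or moving it away from the others increases the determinant of any subset that contains it.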

Digital Discovery
