Navigating chemical reaction space – application to DNA-encoded chemistry†
Abstract
Databases contain millions of reactions for compound synthesis, which makes selecting reactions for the forward synthetic design of small-molecule screening libraries, such as DNA-encoded libraries (DELs), a big-data challenge. To support navigation of this reaction space, we developed the computational workflow Reaction Navigator. Reaction files from a large chemistry database were processed using the open-source KNIME Analytics Platform. Initial processing steps included a customizable filtering cascade that removed reactions with a high probability of being incompatible with DEL synthesis, for example because they would damage the genetic barcode, to arrive at a comprehensive list of transformations potentially applicable to DEL design. These reactions were displayed and clustered by user-defined molecular reaction descriptors that are independent of reaction core substitution patterns. Thanks to the clustering, the reactions can be searched manually to identify those suited to DEL synthesis according to desired reaction criteria, such as ring formation or sp³ content. The workflow was initially applied to map chemical reaction space for aromatic aldehydes, as an example of a functional group often used in DEL synthesis. Representative reactions have been successfully translated to DNA-tagged substrates and can be applied to library synthesis. The versatility of the Reaction Navigator was then shown by mapping reaction space for different reaction conditions, for amines as a second set of starting materials, and for data from a second database.
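The two core steps named above, a filtering cascade followed by grouping on reaction descriptors, can be illustrated with a minimal Python sketch. It is not the KNIME workflow itself. The `Reaction` fields, the temperature cutoff, the list of excluded reagents and the example records are all hypothetical and chosen only for illustration, and exact grouping on descriptor values stands in for the clustering step.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    # All fields are hypothetical; a real database export has its own schema.
    name: str
    temperature_c: float
    reagents: frozenset
    forms_ring: bool
    delta_sp3: int  # change in sp3 carbon count from substrate to product


# Hypothetical exclusion criteria, for illustration only.
MAX_TEMPERATURE_C = 80.0
EXCLUDED_REAGENTS = frozenset({"strong acid", "strong oxidant"})


def mild_temperature(rxn):
    """Pass reactions run at or below the assumed temperature cutoff."""
    return rxn.temperature_c <= MAX_TEMPERATURE_C


def no_excluded_reagents(rxn):
    """Pass reactions that use none of the assumed excluded reagents."""
    return rxn.reagents.isdisjoint(EXCLUDED_REAGENTS)


def filter_cascade(reactions, filters):
    """Apply each filter in turn and keep reactions that pass all of them."""
    kept = list(reactions)
    for passes in filters:
        kept = [rxn for rxn in kept if passes(rxn)]
    return kept


def group_by_descriptors(reactions):
    """Group reaction names by (forms_ring, delta_sp3).

    Neither descriptor refers to the substituents on the reaction core.
    """
    groups = defaultdict(list)
    for rxn in reactions:
        groups[(rxn.forms_ring, rxn.delta_sp3)].append(rxn.name)
    return dict(groups)


# Invented example records, not data from the paper.
reactions = [
    Reaction("A", 25.0, frozenset({"amine"}), False, 1),
    Reaction("B", 150.0, frozenset({"amine"}), True, 2),
    Reaction("C", 40.0, frozenset({"strong acid"}), True, 0),
    Reaction("D", 60.0, frozenset({"isocyanide"}), True, 2),
    Reaction("E", 25.0, frozenset({"borohydride"}), False, 1),
]

kept = filter_cascade(reactions, [mild_temperature, no_excluded_reagents])
clusters = group_by_descriptors(kept)
print(clusters)
```

Because each filter is an ordinary function, the cascade can be customized by adding, removing or reordering entries in the list passed to `filter_cascade`. Browsing for ring-forming reactions then amounts to reading the groups whose key starts with `True`.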