Computer aided recipe design: optimization of polydisperse chemical mixtures using molecular descriptors†
Abstract
A workflow has been developed for the computer-aided design and optimization of reactive systems using the concept of molecular descriptor-based similarity. Unlike the single-molecule models most often used in polymer informatics, this approach describes reaction mixtures more realistically by accounting for polydispersity and individual chain topology. Starting from a specific set of ingredients, i.e., a chemical recipe or formulation, simulations based on Gillespie's kinetic Monte Carlo scheme are used to generate oligomeric and polymeric reaction mixtures. Using the distance (or similarity) in molecular and topological descriptor space as a metric, a Bayesian optimizer then modifies the initial recipe iteratively. The target of the optimization is either another chemical recipe with different ingredients or, alternatively, a set of desirable descriptors and properties. A key step of the process is the transformation of the graphs representing individual polymer species, as obtained from the kinetic simulation, into atomistic species described as SMILES strings, which enables the computation of a rich set of additional descriptors. This rather general mapping is achieved by exploiting similarities between the BioNetGen Language (BNGL) and SMILES graph notations. The workflow is demonstrated on common polyether and polyester oligomeric systems typically used in the chemical industry, but is in general applicable to other polymer chemistries.
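To make the first two steps of the workflow concrete, the following is a minimal sketch, not the implementation used in this work. It assumes an idealized, irreversible linear A-B step-growth system with a single rate constant and no ring formation, and it uses chain-length statistics (number-average and weight-average degree of polymerization, dispersity) as stand-ins for the molecular and topological descriptors. The function names `simulate_step_growth`, `descriptors`, and `descriptor_distance` are hypothetical.

```python
import math
import random


def simulate_step_growth(n_monomers, conversion, k=1.0, seed=0):
    """Gillespie-style kinetic Monte Carlo for an idealized linear A-B
    step-growth polymerization (assumption: one rate constant, no rings).

    Returns the list of chain lengths and the elapsed simulation time.
    """
    rng = random.Random(seed)
    chains = [1] * n_monomers              # every chain starts as a monomer
    t = 0.0
    n_bonds = round(conversion * n_monomers)
    for _ in range(n_bonds):
        n = len(chains)
        if n < 2:
            break                          # a single chain cannot react further
        # Total propensity: the A end of chain i reacts with the B end of
        # chain j, for every ordered pair i != j.
        propensity = k * n * (n - 1)
        t += rng.expovariate(propensity)   # exponentially distributed waiting time
        i, j = rng.sample(range(n), 2)     # all pairs are equally likely
        chains[i] += chains[j]             # merge chain j into chain i
        chains[j] = chains[-1]             # swap-remove chain j
        chains.pop()
    return chains, t


def descriptors(chains):
    """Toy descriptor vector: (Mn, Mw, dispersity) in units of monomers."""
    total = sum(chains)
    mn = total / len(chains)
    mw = sum(c * c for c in chains) / total
    return (mn, mw, mw / mn)


def descriptor_distance(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.dist(a, b)


# Two hypothetical "recipes" that differ only in the targeted conversion.
mix_a, _ = simulate_step_growth(1000, 0.90)
mix_b, _ = simulate_step_growth(1000, 0.95)
print(descriptors(mix_a))
print(descriptor_distance(descriptors(mix_a), descriptors(mix_b)))
```

In a sketch like this, the distance returned by `descriptor_distance` would play the role of the objective that an optimizer minimizes while varying the recipe. In practice the descriptor vectors would likely need scaling before distances are compared, since the components have different magnitudes.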