One class classification as a practical approach for accelerating π–π co-crystal discovery

Aikaterini Vriza; Angelos B. Canaj; Rebecca Vismara; Laurence J. Kershaw Cook; Troy D. Manning; Michael W. Gaultois; Peter A. Wood; Vitaliy Kurlin; Neil Berry; Matthew S. Dyer; Matthew J. Rosseinsky

doi:10.1039/D0SC04263C

Aikaterini Vriza,^ab Angelos B. Canaj,^a Rebecca Vismara,^a Laurence J. Kershaw Cook,^a Troy D. Manning,^a Michael W. Gaultois,^ab Peter A. Wood,^c Vitaliy Kurlin,^d Neil Berry,^a Matthew S. Dyer*^ab and Matthew J. Rosseinsky^ab

Author affiliations

* Corresponding authors

^a Department of Chemistry and Materials Innovation Factory, University of Liverpool, 51 Oxford Street, Liverpool L7 3NY, UK
E-mail: M.S.Dyer@liverpool.ac.uk

^b Leverhulme Research Centre for Functional Materials Design, University of Liverpool, Oxford Street, Liverpool L7 3NY, UK

^c Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge CB2 1EZ, UK

^d Materials Innovation Factory, Computer Science Department, University of Liverpool, Liverpool, UK

Abstract

Machine learning models have brought major changes to decision-making in materials design. One concern for data-driven approaches is the lack of negative data from unsuccessful synthetic attempts, which can produce inherently imbalanced datasets. We propose one-class classification as an effective tool for tackling this limitation in materials design problems. In this approach, a model learns from a single well-defined class, without counterexamples. We carry out an extensive study of different one-class classification algorithms to identify the most appropriate workflow for guiding the discovery of emerging materials belonging to a relatively small class: the weakly bound polyaromatic hydrocarbon co-crystals. The two-step approach presented in this study first trains the model on all known molecular combinations that form this class of co-crystals, extracted from the Cambridge Structural Database (1722 molecular combinations), and then scores possible yet unknown pairs from the ZINC15 database (21 736 possible molecular combinations). Focusing on the highest-ranking pairs, which are predicted to have a higher probability of forming co-crystals, accelerates materials discovery by reducing the vast molecular space and directing the synthetic efforts of chemists. Interpretability techniques are then used to seek a more detailed understanding of the molecular properties that drive co-crystallization. The applicability of the methodology is demonstrated by the discovery of two novel co-crystals, namely pyrene-6H-benzo[c]chromen-6-one (1) and pyrene-9,10-dicyanoanthracene (2).
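The two-step idea in the abstract (learn from positive examples only, then score and rank unlabeled pairs) can be illustrated with a minimal sketch. This is not the authors' workflow: the k-nearest-neighbour distance score is a simple stand-in for a one-class method, and the two-dimensional descriptor vectors, the pair names and the functions `knn_score` and `rank_candidates` are all hypothetical, chosen only to make the example self-contained.

```python
import math


def knn_score(candidate, known, k=3):
    """Score a candidate by its closeness to the known (positive) class.

    Returns the negative mean Euclidean distance to the k nearest known
    examples, so a higher score (closer to zero) means "more like the
    known class". No negative examples are needed.
    """
    dists = sorted(math.dist(candidate, x) for x in known)
    nearest = dists[:k]
    return -sum(nearest) / len(nearest)


def rank_candidates(candidates, known, k=3):
    """Return (name, score) pairs sorted from highest to lowest score."""
    scored = [(name, knn_score(vec, known, k)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Step 1: the "training" data are only known positives.
# Hypothetical descriptor vectors for known co-crystal pairs.
known_pairs = [(0.9, 1.0), (1.0, 1.1), (1.1, 0.9), (1.0, 1.0), (0.95, 1.05)]

# Step 2: score unlabeled candidate pairs (hypothetical values).
candidate_pairs = {
    "pair_A": (1.0, 1.02),
    "pair_B": (3.0, 3.0),
    "pair_C": (1.4, 1.3),
}

ranking = rank_candidates(candidate_pairs, known_pairs)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy setting, candidates near the cluster of known pairs rank first, and only the top of the ranking would be passed on for experimental screening. A real application would replace the toy vectors with molecular descriptors and the distance score with a one-class algorithm selected by benchmarking.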

This article is part of the themed collections: Editor’s Choice: Malika Jeffries-EL and 2020 Chemical Science HOT Article Collection

Chemical Science
