Predicting aggregate morphology of sequence-defined macromolecules with recurrent neural networks

Debjyoti Bhattacharya; Devon C. Kleeblatt; Antonia Statt; Wesley F. Reinhart

doi:10.1039/D2SM00452F

Debjyoti Bhattacharya,^a Devon C. Kleeblatt,^a Antonia Statt^b and Wesley F. Reinhart*^ac

Author affiliations

* Corresponding authors

^a Materials Science and Engineering, Pennsylvania State University, University Park, PA 16802, USA
E-mail: reinhart@psu.edu

^b Materials Science and Engineering, Grainger College of Engineering, University of Illinois, Urbana-Champaign, IL 61801, USA

^c Institute for Computational and Data Sciences, Pennsylvania State University, University Park, PA 16802, USA

Abstract

Self-assembly of dilute sequence-defined macromolecules is a complex phenomenon in which the local arrangement of chemical moieties can lead to the formation of long-range structure. The dependence of this structure on the sequence necessarily implies that a mapping between the two exists, yet it has been difficult to model so far. Predicting the aggregation behavior of these macromolecules is challenging due to the lack of effective order parameters, a vast design space, inherent variability, and high computational costs associated with currently available simulation techniques. Here, we accurately predict the morphology of aggregates self-assembled from sequence-defined macromolecules using supervised machine learning. We find that regression models with implicit representation learning perform significantly better than those based on engineered features such as k-mer counting, and a recurrent-neural-network-based regressor performs the best out of nine model architectures we tested. Furthermore, we demonstrate the high-throughput screening of monomer sequences using the regression model to identify candidates for self-assembly into selected morphologies. Our strategy is shown to successfully identify multiple suitable sequences in every test we performed, so we hope the insights gained here can be extended to other increasingly complex design scenarios in the future, such as the design of sequences under polydispersity and at varying environmental conditions.
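As a point of reference for the engineered-feature baseline mentioned above, k-mer counting can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function name `kmer_counts`, the two-letter monomer alphabet `"AB"`, and the example sequence are all assumptions made here for demonstration.

```python
from collections import Counter
from itertools import product


def kmer_counts(sequence, k, alphabet="AB"):
    """Return a fixed-length vector of overlapping k-mer counts.

    The alphabet "AB" is a hypothetical two-monomer alphabet chosen
    for illustration; it is not taken from the article.
    """
    if k < 1 or k > len(sequence):
        raise ValueError("k must be between 1 and len(sequence)")
    # Slide a window of width k along the sequence and tally each substring.
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    # Enumerate every possible k-mer in a fixed order so that all
    # sequences map to feature vectors of the same length.
    vocabulary = ["".join(p) for p in product(alphabet, repeat=k)]
    return [counts[kmer] for kmer in vocabulary]


# Hypothetical example sequence; the vector order is AA, AB, BA, BB.
features = kmer_counts("AABABBAB", k=2)
print(features)  # [1, 3, 2, 1]
```

A vector like this discards where each k-mer occurs along the chain, which is one plausible reason a model that reads the sequence in order, such as a recurrent network, might capture more of the sequence-morphology mapping.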

This article is part of the themed collection: Machine Learning and Artificial Intelligence: A cross-journal collection

Article information

DOI: https://doi.org/10.1039/D2SM00452F
Article type: Paper
Submitted: 10 Apr 2022
Accepted: 15 Jun 2022
First published: 15 Jun 2022

Soft Matter, 2022, 18, 5037-5051

D. Bhattacharya, D. C. Kleeblatt, A. Statt and W. F. Reinhart, Soft Matter, 2022, 18, 5037 DOI: 10.1039/D2SM00452F
