Solid-state synthesizability predictions using positive-unlabeled learning from human-curated literature data

Vincent Chung; Aron Walsh; David J. Payne

doi:10.1039/D5DD00065C

Vincent Chung,*^a Aron Walsh^a and David J. Payne^abc

Author affiliations

* Corresponding authors

^a Department of Materials, Imperial College London, South Kensington, London SW7 2AZ, UK
E-mail: vincent.chung15@imperial.ac.uk

^b Research Complex at Harwell, Harwell Science and Innovation Campus, Didcot, Oxfordshire OX11 0FA, UK

^c NEOM Education, Research, and Innovation Foundation, Al Khuraybah, Tabuk 49643-9136, Saudi Arabia

Abstract

The rate of materials discovery is limited by the experimental validation of promising candidate materials generated from high-throughput calculations. Although data-driven approaches that use text-mined datasets have shown some success in aiding synthesis planning and synthesizability prediction, they are limited by the quality of the underlying datasets. In this study, synthesis information for 4103 ternary oxides was extracted from the literature, including whether each oxide has been synthesized via solid-state reaction and the associated reaction conditions. This dataset provides an opportunity to supplement existing solid-state reaction models with reliable data, including information from articles whose content and formats are challenging to extract automatically. A simple screening using this dataset identified 156 outliers in a 4800-entry subset of a text-mined dataset; only 15% of these outliers had been extracted correctly. Finally, this dataset was used to train a positive-unlabeled learning model to predict the solid-state synthesizability of new ternary oxides, which predicts that 134 of 4312 hypothetical compositions are likely to be synthesizable.
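To make the idea of positive-unlabeled (PU) learning concrete, the following is a minimal sketch of one common PU strategy, bagging over randomly drawn pseudo-negatives. It is not the authors' model: the abstract does not specify their classifier or features, so the nearest-centroid base learner, the function name `bagging_pu_scores`, and the two-dimensional toy data are all assumptions made for illustration. In each round, a random subset of the unlabeled points is treated as negative, and the remaining unlabeled points are scored by whether they lie closer to the positive centroid.

```python
import random


def centroid(rows):
    """Mean vector of a list of equal-length feature tuples."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def bagging_pu_scores(positives, unlabeled, n_rounds=200, seed=0):
    """Score each unlabeled point in [0, 1] by out-of-bag voting.

    Hypothetical helper for illustration only. Each round draws a random
    subset of the unlabeled points as pseudo-negatives, fits a
    nearest-centroid rule, and votes on the unlabeled points left out.
    """
    rng = random.Random(seed)
    pos_centroid = centroid(positives)
    votes = [0] * len(unlabeled)
    counts = [0] * len(unlabeled)
    # Draw as many pseudo-negatives as positives, leaving at least one out.
    k = min(len(positives), len(unlabeled) - 1)
    for _ in range(n_rounds):
        sampled = set(rng.sample(range(len(unlabeled)), k))
        neg_centroid = centroid([unlabeled[i] for i in sampled])
        for i, x in enumerate(unlabeled):
            if i in sampled:
                continue  # only score points not used as pseudo-negatives
            counts[i] += 1
            if sq_dist(x, pos_centroid) < sq_dist(x, neg_centroid):
                votes[i] += 1
    return [v / c if c else float("nan") for v, c in zip(votes, counts)]


# Toy data (invented): labeled positives cluster near (1, 1); the unlabeled
# set mixes two positive-like points with six points near (-1, -1).
positives = [(1.0, 1.1), (0.9, 1.0), (1.1, 0.9), (1.0, 0.8)]
unlabeled = [
    (1.0, 1.0), (0.95, 1.05),
    (-1.0, -1.0), (-0.9, -1.1), (-1.1, -0.9),
    (-1.0, -1.2), (-0.8, -1.0), (-1.2, -1.0),
]
scores = bagging_pu_scores(positives, unlabeled)
for point, score in zip(unlabeled, scores):
    print(point, round(score, 2))
```

In this toy setting the first two unlabeled points receive high scores and the rest receive low ones. A real synthesizability model would replace the toy coordinates with composition-derived features and the centroid rule with a stronger base classifier; those choices are outside what the abstract states.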

Article information

DOI: https://doi.org/10.1039/D5DD00065C
Article type: Paper
Submitted: 16 Feb 2025
Accepted: 16 Jul 2025
First published: 19 Jul 2025
This article is Open Access

Digital Discovery, 2025, Advance Article

V. Chung, A. Walsh and D. J. Payne, Digital Discovery, 2025, Advance Article , DOI: 10.1039/D5DD00065C

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
