Solid-state synthesizability predictions using positive-unlabeled learning from human-curated literature data

Abstract

The rate of materials discovery is limited by the experimental validation of promising candidate materials generated from high-throughput calculations. Data-driven approaches that use text-mined datasets have shown some success in aiding synthesis planning and synthesizability prediction, but they are limited by the quality of the underlying datasets. In this study, synthesis information for 4103 ternary oxides was extracted from the literature, including whether each oxide has been synthesized via solid-state reaction and the associated reaction conditions. This dataset provides an opportunity to supplement existing solid-state reaction models with reliable data, including information from articles whose content and formats are challenging to extract automatically. A simple screening using this dataset identified 156 outliers in a 4800-entry subset of a text-mined dataset; only 15% of these outliers had been extracted correctly. Finally, this dataset was used to train a positive-unlabeled learning model to predict the solid-state synthesizability of new ternary oxides; the model predicts that 134 of 4312 hypothetical compositions are likely to be synthesizable.
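To make the idea of positive-unlabeled (PU) learning concrete, the following is a minimal sketch and not the model used in the article. The abstract does not say which PU algorithm or features were used, so this example assumes a bagging-style PU scheme with a nearest-centroid base classifier on made-up two-dimensional features. All names (`pu_bagging_scores`, `positives`, `unlabeled`) and all data values are hypothetical. In each round, a random subset of the unlabeled points is treated as provisional negatives, and the remaining unlabeled points are scored by whether they lie closer to the positive centroid.

```python
import random


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(col) for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pu_bagging_scores(positives, unlabeled, n_rounds=200, seed=0):
    """Fraction of out-of-bag rounds in which each unlabeled point
    is closer to the positive centroid than to the pseudo-negative one."""
    rng = random.Random(seed)
    # Bag size: as many pseudo-negatives as positives, but always
    # leave at least one unlabeled point out of the bag.
    k = min(len(positives), len(unlabeled) - 1)
    pos_c = centroid(positives)
    votes = [0] * len(unlabeled)
    counts = [0] * len(unlabeled)
    for _ in range(n_rounds):
        bag = set(rng.sample(range(len(unlabeled)), k))
        neg_c = centroid([unlabeled[i] for i in bag])
        for i, x in enumerate(unlabeled):
            if i in bag:
                continue  # only score points not used as pseudo-negatives
            counts[i] += 1
            if sq_dist(x, pos_c) < sq_dist(x, neg_c):
                votes[i] += 1
    return [v / c if c else float("nan") for v, c in zip(votes, counts)]


# Hypothetical toy data: labeled positives (e.g. compounds reported as
# synthesized) and unlabeled candidates with no reported synthesis.
positives = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1)]
unlabeled = [
    (1.1, 1.0), (0.95, 1.05),                  # resemble the positives
    (-1.0, -1.0), (-1.2, -0.8), (-0.9, -1.1),  # do not resemble them
    (-1.1, -1.0), (-0.8, -1.2), (-1.0, -0.9),
]

scores = pu_bagging_scores(positives, unlabeled)
for point, score in sorted(zip(unlabeled, scores), key=lambda t: -t[1]):
    print(point, round(score, 2))
```

Unlabeled points with high scores are the ones such a model would flag as likely positives. A real synthesizability model would replace the toy features with composition or structure descriptors and the centroid rule with a stronger base classifier.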

Article information

Article type: Paper
Submitted: 16 Feb 2025
Accepted: 16 Jul 2025
First published: 19 Jul 2025
This article is Open Access (Creative Commons BY license)

Digital Discovery, 2025, Advance Article

V. Chung, A. Walsh and D. J. Payne, Digital Discovery, 2025, Advance Article, DOI: 10.1039/D5DD00065C

This article is licensed under a Creative Commons Attribution 3.0 Unported Licence. You can use material from this article in other publications without requesting further permissions from the RSC, provided that the correct acknowledgement is given.
