Extrapolation validation (EV): a universal validation method for mitigating machine learning extrapolation risk

Mengxian Yu; Yin-Ning Zhou; Qiang Wang; Fangyou Yan

doi:10.1039/D3DD00256J

Mengxian Yu,^a Yin-Ning Zhou,^b Qiang Wang^a and Fangyou Yan*^a

Author affiliations

* Corresponding authors

^a School of Chemical Engineering and Material Science, Tianjin University of Science and Technology, Tianjin 300457, P. R. China
E-mail: yanfangyou@tust.edu.cn

^b Department of Chemical Engineering, School of Chemistry and Chemical Engineering, Shanghai Jiao Tong University, Shanghai 200240, P. R. China

Abstract

Machine learning (ML) can provide decision-making advice for major challenges in science and engineering, and its rapid development has led to advances in fields like chemistry & medicine, earth & life sciences, and communications & transportation. Grasping the trustworthiness of the decision-making advice given by ML models remains challenging, especially when applying them to samples outside the domain-of-application. Here, an untrustworthy application situation (i.e., complete extrapolation-failure) that would occur in models developed by ML methods involving tree algorithms is confirmed, and the root cause of its difficulty in discovering novel materials & chemicals is revealed. Furthermore, a universal extrapolation risk evaluation scheme, termed the extrapolation validation (EV) method, is proposed, which is not restricted to specific ML methods and model architecture in its applicability. The EV method quantitatively evaluates the extrapolation ability of 11 popularly applied ML methods and digitalizes the extrapolation risk arising from variations of the independent variables in each method. Meanwhile, the EV method provides insights and solutions for evaluating the reliability of out-of-distribution sample prediction and selecting trustworthy ML methods.
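The extrapolation failure of tree-based models described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the abstract does not spell out the EV procedure, so the split below simply assumes the general idea of sorting samples by one independent variable and holding out the samples with the largest values, so that every test sample lies outside the training range. The helper names (`extrapolation_split`, `fit_tree`, `fit_line`), the synthetic data `y = 3x + 1`, and the 20% test fraction are all hypothetical choices made for illustration.

```python
from statistics import mean


def extrapolation_split(xs, ys, test_fraction=0.2):
    """Sort by x and hold out the largest values, so the test set lies
    outside the training range (an assumed, simplified EV-style split)."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    cut = len(order) - round(len(order) * test_fraction)
    train, test = order[:cut], order[cut:]
    return ([xs[i] for i in train], [ys[i] for i in train],
            [xs[i] for i in test], [ys[i] for i in test])


def fit_tree(xs, ys, min_leaf=2):
    """Fit a 1-D regression tree by recursively picking the split with the
    lowest summed squared error. Leaves store the mean training target."""
    if len(xs) < 2 * min_leaf:
        return ("leaf", mean(ys))
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(min_leaf, len(pairs) - min_leaf + 1):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # cannot split between identical x values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        ml, mr = mean(left), mean(right)
        sse = sum((y - ml) ** 2 for y in left) + sum((y - mr) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, i)
    if best is None:
        return ("leaf", mean(ys))
    i = best[1]
    threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
    left_x, left_y = zip(*pairs[:i])
    right_x, right_y = zip(*pairs[i:])
    return ("node", threshold,
            fit_tree(left_x, left_y, min_leaf),
            fit_tree(right_x, right_y, min_leaf))


def predict_tree(tree, x):
    """Walk down the tree until a leaf is reached and return its value."""
    while tree[0] == "node":
        _, threshold, left, right = tree
        tree = left if x <= threshold else right
    return tree[1]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


def rmse(preds, targets):
    """Root-mean-square error between predictions and targets."""
    return (sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)) ** 0.5


# Synthetic, noise-free linear data (hypothetical, for illustration only).
xs = [0.5 * i for i in range(40)]
ys = [3.0 * x + 1.0 for x in xs]

train_x, train_y, test_x, test_y = extrapolation_split(xs, ys)

tree = fit_tree(train_x, train_y)
slope, intercept = fit_line(train_x, train_y)

tree_preds = [predict_tree(tree, x) for x in test_x]
line_preds = [slope * x + intercept for x in test_x]

tree_rmse = rmse(tree_preds, test_y)
line_rmse = rmse(line_preds, test_y)

print("largest training target:", max(train_y))
print("largest tree prediction on the extrapolation set:", max(tree_preds))
print("tree RMSE:", tree_rmse, "| linear RMSE:", line_rmse)
```

Because every leaf of the tree stores an average of training targets, its predictions can never exceed the largest target seen during training, whereas the linear fit follows the trend beyond the training range. A random in-distribution split of the same data would hide this difference, which is the kind of risk an extrapolation-oriented validation is meant to expose.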

Supplementary files

Supplementary information PDF (2680K)

Article information

DOI: https://doi.org/10.1039/D3DD00256J
Article type: Paper
Submitted: 29 Dec 2023
Accepted: 17 Apr 2024
First published: 19 Apr 2024
This article is Open Access

Digital Discovery, 2024, 3, 1058-1067

Permissions

M. Yu, Y. Zhou, Q. Wang and F. Yan, Digital Discovery, 2024, 3, 1058, DOI: 10.1039/D3DD00256J

This article is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported Licence. You can use material from this article in other publications, without requesting further permission from the RSC, provided that the correct acknowledgement is given and it is not used for commercial purposes.
