Bayesian optimization of nanoporous materials
Abstract
Nanoporous materials (NPMs) could be used to store, capture, and sense many different gases. Given an adsorption task, we often wish to search a library of NPMs for the one with the optimal adsorption property. The high cost of NPM synthesis and gas adsorption measurements, whether performed in the lab or in simulation, often precludes exhaustive search. We explain, demonstrate, and advocate Bayesian optimization (BO) to actively search a library of NPMs for the optimal NPM and to find it using as few experiments as possible. The two ingredients of BO are a surrogate model and an acquisition function. The surrogate model is a probabilistic model reflecting our beliefs about the NPM-structure–property relationship based on observations from past experiments. The acquisition function uses the surrogate model to score each NPM according to the utility of picking it for the next experiment. It balances two competing goals: (a) exploitation of our current approximation of the structure–property relationship, to pick the NPM we believe (under uncertainty) will be the highest-performing, and (b) exploration of regions of NPM space we have not visited, to pick an NPM we are uncertain about and thereby improve our approximation of the structure–property relationship. We demonstrate BO by searching an open database of ∼70 000 hypothetical covalent organic frameworks (COFs) for the COF with the highest simulated methane deliverable capacity (pertinent for vehicular adsorbed natural gas storage). BO finds the optimal COF and acquires ∼30% of the 100 highest-ranked COFs after evaluating only ∼140 COFs. Moreover, BO searches more efficiently than evolutionary and one-shot supervised machine learning approaches.
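To make the surrogate-model and acquisition-function loop concrete, the following is a minimal sketch of BO over a finite library. It is an illustration under stated assumptions, not the implementation used in this work: the surrogate is assumed to be a Gaussian process with a squared-exponential kernel and fixed hyperparameters, the acquisition function is assumed to be an upper confidence bound (posterior mean plus a multiple of the posterior standard deviation), and the library, its two-dimensional feature vectors, and the "property" are synthetic. All names (`rbf_kernel`, `gp_posterior`, `bayesian_search`, `run_experiment`, `beta`) are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=0.3):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dists / length_scale**2)

def gp_posterior(X_obs, y_obs, X_all, jitter=1e-5):
    """Posterior mean and standard deviation of a GP surrogate at X_all.

    Assumption: constant prior mean set to the average observed value,
    unit prior variance, and fixed kernel hyperparameters.
    """
    y_mean = y_obs.mean()
    K = rbf_kernel(X_obs, X_obs) + jitter * np.eye(len(X_obs))
    K_s = rbf_kernel(X_obs, X_all)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs - y_mean))
    mu = y_mean + K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = 1.0 - (v**2).sum(axis=0)
    return mu, np.sqrt(np.clip(var, 0.0, None))

def bayesian_search(X, run_experiment, n_init=5, n_iter=25, beta=2.0, seed=0):
    """Sequentially pick library members to evaluate.

    X holds one feature vector per material; run_experiment(i) returns the
    measured property of material i (the expensive step in practice).
    """
    rng = np.random.default_rng(seed)
    # Start from a few randomly chosen materials.
    acquired = [int(i) for i in rng.choice(len(X), size=n_init, replace=False)]
    y = [run_experiment(i) for i in acquired]
    for _ in range(n_iter):
        mu, sigma = gp_posterior(X[acquired], np.array(y), X)
        # Upper confidence bound: mu rewards exploitation, sigma exploration.
        score = mu + beta * sigma
        score[acquired] = -np.inf  # never repeat an experiment
        nxt = int(np.argmax(score))
        acquired.append(nxt)
        y.append(run_experiment(nxt))
    return acquired, y

# Synthetic stand-in for a materials library: 200 candidates, 2 features each.
lib_rng = np.random.default_rng(1)
X = lib_rng.random((200, 2))
# Hidden "property" (a smooth bump), unknown to the search until evaluated.
true_property = np.exp(-8.0 * ((X - 0.7) ** 2).sum(axis=1))

acquired, y = bayesian_search(X, lambda i: float(true_property[i]))
best_found = max(y)
print(f"evaluated {len(acquired)} of {len(X)} candidates; "
      f"best found = {best_found:.3f}, library max = {true_property.max():.3f}")
```

Setting `beta` to zero makes the search purely exploitative, since it then always picks the highest predicted mean, while a large `beta` favors candidates the surrogate is most uncertain about. In a real search, `run_experiment` would wrap a synthesis-and-measurement step or a molecular simulation, and the feature vectors would be structural descriptors of each material.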