A high-throughput computational dataset of halide perovskite alloys†
Abstract
Novel halide perovskites with improved stability and optoelectronic properties can be designed via composition engineering at cation and/or anion sites. Data-driven methods, especially those involving high-throughput first-principles computations and subsequent analysis based on unique materials descriptors, are key to achieving this goal. In this work, we report a density functional theory (DFT) dataset of 495 ABX3 halide perovskite compounds, with monovalent organic or inorganic cations as A, divalent Group 2 or Group 14 elements as B, and I, Br, or Cl as X, and with different amounts of mixing applied at each site using the special quasirandom structures (SQS) approach. We perform GGA-PBE calculations on all 495 pseudo-cubic perovskite structures, and between 250 and 300 calculations each using the more expensive HSE06 functional with and without spin–orbit coupling, covering both full geometry optimization and static calculations on PBE-optimized structures. Lattice parameters, decomposition energy, band gap, and theoretical photovoltaic efficiency derived from computed optical absorption spectra are determined at each level of theory, and some comparisons are made with collected experimental values. Trends in the data are unraveled in terms of the effects of mixing at different sites, the fractions of specific elemental or molecular species present in the compound, and the averaged physical properties of species at different sites. We perform screening across the perovskite dataset based on multiple known definitions of stability factors, deviation from cubicity in the optimized cell, and computed stability and optoelectronic properties, leading to a list of promising compositions as well as design principles for achieving multiple desired properties.
Our multi-objective, multi-fidelity computational halide perovskite alloy dataset, one of the most comprehensive to date, is openly available and is currently being used to train predictive and optimization models for accelerating the design of novel compositions with superior performance across many optoelectronic applications.
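As an illustration of the kind of stability factors used in such screening, the following minimal sketch computes the Goldschmidt tolerance factor and the octahedral factor for a mixed ABX3 composition. It assumes fraction-weighted averaging of ionic radii at each site and uses Shannon radii for a few species; the function names, the radii table, and the averaging scheme are assumptions made for this example and are not necessarily the definitions applied to the dataset.

```python
import math

# Ionic radii in angstroms (Shannon values; assumed inputs for illustration only)
RADII = {"Cs": 1.88, "Pb": 1.19, "I": 2.20, "Br": 1.96}


def site_average(fractions):
    """Fraction-weighted mean radius for one site, e.g. {"I": 0.5, "Br": 0.5}."""
    total = sum(fractions.values())
    if not math.isclose(total, 1.0, abs_tol=1e-6):
        raise ValueError("site fractions must sum to 1")
    return sum(RADII[species] * frac for species, frac in fractions.items())


def tolerance_factor(a_site, b_site, x_site):
    """Goldschmidt tolerance factor t = (r_A + r_X) / (sqrt(2) * (r_B + r_X))."""
    r_a = site_average(a_site)
    r_b = site_average(b_site)
    r_x = site_average(x_site)
    return (r_a + r_x) / (math.sqrt(2) * (r_b + r_x))


def octahedral_factor(b_site, x_site):
    """Octahedral factor mu = r_B / r_X."""
    return site_average(b_site) / site_average(x_site)


# Pure CsPbI3 and an X-site alloy CsPb(I0.5Br0.5)3
t_pure = tolerance_factor({"Cs": 1.0}, {"Pb": 1.0}, {"I": 1.0})
t_mixed = tolerance_factor({"Cs": 1.0}, {"Pb": 1.0}, {"I": 0.5, "Br": 0.5})
mu_pure = octahedral_factor({"Pb": 1.0}, {"I": 1.0})
```

With these radii, partially replacing I with Br raises the tolerance factor slightly, because the averaged X-site radius shrinks. Any screening window applied to such factors would be a further assumption and is not fixed by this sketch.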