Machine-learning-guided prediction of photovoltaic performance of non-fullerene organic solar cells using novel molecular and structural descriptors†
Abstract
Recent developments in novel conjugated polymer donor and non-fullerene acceptor (NFA) materials with promising properties have raised the power conversion efficiency (PCE) of organic solar cells (OSCs) to more than 19%. However, identifying promising combinations of such donor and acceptor materials through conventional trial-and-error experiments alone is impractical. Herein, we predicted and screened the performance of OSCs based on various polymer:NFA combinations by employing a data-driven machine learning (ML) approach, and subsequently validated the predictions by fabricating a set of highly efficient devices with a PCE of up to 15.23%. A dataset of 1242 experimentally verified donor:acceptor (D/A) combinations was constructed, and the corresponding material descriptors were generated to train and test five different supervised ML models. Using a combination of frontier molecular orbital (FMO) and RDKit descriptors as input features, the random forest model performed best for predicting the PCE, with a Pearson's coefficient (r) of 0.791 and a mean absolute percentage error of 2.004. The gradient-boosting model, in turn, performed best for predicting the short-circuit current density (JSC) and open-circuit voltage (VOC), with r values of 0.842 and 0.862, respectively. Furthermore, SHapley Additive exPlanations (SHAP) analyses revealed the importance of critical RDKit descriptors, alongside the FMO descriptors, in these performance predictions. The proposed ML framework, guided by these new descriptors, can therefore help in designing new molecules and in screening and predicting suitable D/A combinations to accelerate the development of highly efficient OSCs.
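For readers less familiar with the two evaluation metrics quoted above, the following minimal sketch shows how Pearson's coefficient (r) and the mean absolute percentage error (MAPE) can be computed from measured and predicted values. It is an illustration only, not the authors' implementation: the function names `pearson_r` and `mape` and the sample PCE values are hypothetical, and expressing MAPE in percent is an assumption, since the abstract does not state the convention used.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between measured and predicted values."""
    n = len(y_true)
    if n != len(y_pred) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    # Covariance and variances (unnormalised; the 1/n factors cancel in the ratio).
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (assumed convention).

    Requires non-zero measured values, which holds for working devices.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty sequences of equal length")
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical measured vs. predicted PCE values (%), for illustration only.
measured_pce = [10.0, 12.0, 8.0, 15.0]
predicted_pce = [9.5, 12.6, 8.4, 14.25]

print("r    =", pearson_r(measured_pce, predicted_pce))
print("MAPE =", mape(measured_pce, predicted_pce))
```

An r close to 1 indicates that predictions track the measured trend, while MAPE reports the average relative deviation; the two are complementary, because a model can rank devices well (high r) and still be systematically offset (high MAPE).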