Database for liquid phase diffusion coefficients at infinite dilution at 298 K and matrix completion methods for their prediction†
Abstract
Experimental data on diffusion in binary liquid mixtures at 298 ± 1 K were systematically consolidated from the literature and used to determine, in a consistent manner, the diffusion coefficients D∞ij of solutes i at infinite dilution in solvents j. The resulting database comprises essentially all available data on D∞ij at 298 K and includes 353 points, covering 208 solutes and 51 solvents. In a first step, the new database was used to evaluate semiempirical methods from the literature for predicting D∞ij, namely those of Wilke and Chang, Reddy and Doraiswamy, and Tyn and Calus, as well as SEGWE; of these, SEGWE yielded the best results. Furthermore, a new method for predicting D∞ij was developed based on the concept of matrix completion from machine learning. It exploits the fact that the experimental data for D∞ij can be represented as elements of a sparse matrix whose rows and columns correspond to the solutes i and solvents j, and it is demonstrated that matrix completion methods (MCMs) can be used to close the gaps in this matrix. Three variants of this approach were studied: a purely data-driven MCM and two hybrid MCMs, which use information from SEGWE together with the experimental data. The methods were evaluated using the new database. The hybrid MCMs outperform both the data-driven MCM and all established semiempirical models in terms of predictive accuracy.
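The idea of closing gaps in a sparse solute-by-solvent matrix can be illustrated with a minimal sketch. This is not the method developed in this work: it swaps in a generic low-rank factorization fitted by regularized alternating least squares, and it runs on synthetic numbers rather than on the database. The function name `complete_matrix`, the rank, the regularization strength, and the matrix size are all assumptions chosen for illustration only.

```python
import numpy as np


def complete_matrix(Y, mask, rank=2, reg=0.1, n_iter=50, seed=0):
    """Predict all entries of Y from the observed ones (mask == True).

    Fits Y ~ U @ V.T by regularized alternating least squares (ALS),
    using only the observed entries. Hypothetical helper, not the
    approach of the paper.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = Y.shape
    U = rng.normal(scale=0.1, size=(n_rows, rank))  # row (solute) features
    V = rng.normal(scale=0.1, size=(n_cols, rank))  # column (solvent) features
    ridge = reg * np.eye(rank)
    for _ in range(n_iter):
        # Update each row's features with the column features held fixed.
        for i in range(n_rows):
            obs = mask[i]
            if obs.any():
                Vo = V[obs]
                U[i] = np.linalg.solve(Vo.T @ Vo + ridge, Vo.T @ Y[i, obs])
        # Update each column's features with the row features held fixed.
        for j in range(n_cols):
            obs = mask[:, j]
            if obs.any():
                Uo = U[obs]
                V[j] = np.linalg.solve(Uo.T @ Uo + ridge, Uo.T @ Y[obs, j])
    return U @ V.T


# Synthetic stand-in for a matrix of (assumed log-transformed) diffusion
# coefficients: 20 hypothetical solutes x 8 hypothetical solvents.
rng = np.random.default_rng(42)
Y_true = rng.normal(size=(20, 8 - 6)) @ rng.normal(size=(8, 2)).T
mask = rng.random(Y_true.shape) < 0.6          # roughly 60 % of entries observed
Y_obs = np.where(mask, Y_true, 0.0)            # unobserved entries are ignored

Y_hat = complete_matrix(Y_obs, mask)
rmse_observed = np.sqrt(np.mean((Y_hat[mask] - Y_true[mask]) ** 2))
rmse_missing = np.sqrt(np.mean((Y_hat[~mask] - Y_true[~mask]) ** 2))
print(f"RMSE on observed entries: {rmse_observed:.3f}")
print(f"RMSE on held-out entries: {rmse_missing:.3f}")
```

In this sketch, each row and each column receives a small feature vector, and a missing entry is predicted as the dot product of the corresponding pair. A hybrid variant would additionally feed in predictions from a physical model; how that is done here is not specified in the abstract, so it is left out of the sketch.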