Multimodal learning in synthetic chemistry applications: gas chromatography retention time prediction and isomer separation optimization
Abstract
Multimodal learning, a key machine learning (ML) approach, has been extensively applied in fields such as medical diagnostics and recommendation systems. The complexity of chemical data offers unique opportunities for multimodal learning, yet its application in chemistry remains underexplored. Here, we propose a multimodal framework for gas chromatography (GC) that integrates a geometry-enhanced graph isomorphism network with gated recurrent units. The framework predicts GC retention times across diverse molecules and heating profiles with a test-set R² of 0.995, outperforming traditional ML methods. It also recommends optimal chromatographic conditions for separating positional isomers and cis/trans isomers, reducing the number of experimental iterations and improving analytical efficiency. Moreover, the model provides insights into the separation challenges posed by different isomers, clarifying the relationship between molecular structure and chromatographic behavior. This approach could pave the way for broader applications of multimodal learning in chemistry.
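
For readers less familiar with the reported metric, the coefficient of determination can be computed as 1 - SS_res / SS_tot, where SS_res is the sum of squared prediction errors and SS_tot is the total sum of squares around the mean observed value. The following is a minimal sketch in plain Python. The function name `r_squared` and the retention-time values are illustrative assumptions, not the authors' code or data.

```python
from typing import Sequence


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error left unexplained by the predictions.
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical retention times in minutes (illustrative values, not from the paper).
observed_rt = [4.0, 6.0, 8.0, 10.0]
predicted_rt = [4.5, 5.5, 8.5, 9.5]
score = r_squared(observed_rt, predicted_rt)
print(f"R^2 = {score:.3f}")
```

A value close to 1 means the predictions account for nearly all of the variance in the measured retention times. A model that always predicts the mean observed value would score 0.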