Combinatorial discovery of antibacterials via a feature-fusion based machine learning workflow†
Abstract
The discovery of new antibacterials within the vast chemical space is crucial in combating drug-resistant bacteria such as methicillin-resistant Staphylococcus aureus (MRSA). However, the traditional approach of exhaustively screening an entire chemical library is laborious and time-consuming. Machine learning-assisted screening of antibacterials reduces the exploration effort but suffers from a lack of reliable and relevant datasets. To address these challenges, we devised a combinatorial library comprising over 110 000 candidates based on the Ugi reaction. A focused library was subsequently generated by uniformly sampling the entire library to reduce the scale of the preliminary screening. A novel feature-fusion architecture, the latent space constraint neural network, was developed to predict antibacterial properties from both fingerprint and physicochemical molecular descriptors. This integration allowed the model to leverage the complementary information provided by these descriptors and improved the accuracy of its predictions. Three lead compounds that demonstrated excellent efficacy against MRSA while alleviating drug resistance were identified. This workflow highlights the integration of machine learning with a combinatorial chemical library to expedite high-quality data collection and extensive data mining for antibacterial screening.
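The library-construction and sampling steps can be illustrated with a minimal sketch. It assumes the standard four-component form of the Ugi reaction (amine, aldehyde, carboxylic acid, isocyanide) and interprets "uniform sampling" as uniform random sampling without replacement; the abstract does not specify either detail. The building-block labels, their counts, and the function names `enumerate_ugi_library` and `sample_focused_library` are hypothetical and chosen only for illustration.

```python
import itertools
import random


def enumerate_ugi_library(amines, aldehydes, acids, isocyanides):
    """Return every four-component combination as a tuple of building-block labels."""
    return list(itertools.product(amines, aldehydes, acids, isocyanides))


def sample_focused_library(library, n, seed=0):
    """Draw n distinct candidates uniformly at random (assumed sampling scheme)."""
    rng = random.Random(seed)  # fixed seed so the focused library is reproducible
    return rng.sample(library, n)


# Hypothetical building-block labels; a real library would use actual reagents.
amines = [f"amine_{i}" for i in range(5)]
aldehydes = [f"aldehyde_{i}" for i in range(4)]
acids = [f"acid_{i}" for i in range(3)]
isocyanides = [f"isocyanide_{i}" for i in range(2)]

library = enumerate_ugi_library(amines, aldehydes, acids, isocyanides)
focused = sample_focused_library(library, n=12)
```

The library size is the product of the four building-block counts (here 5 × 4 × 3 × 2 = 120), which is how a modest set of reagents can span more than 110 000 candidates; the focused subset is what would be synthesized and assayed to produce training data.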