A machine learning approach toward generating the focused molecule library targeting CAG repeat DNA†
Abstract
This study reports a machine learning classification approach that uses surface plasmon resonance (SPR)-labeled data to generate a focused molecule library targeting CAG repeat DNA. Combining SPR screening with a machine learning classification model streamlines the identification of new hit compounds for the next round of wet-lab experiments. The reported model increased the probability of finding a hit from 5.2% in the original library to 20.6% in the focused molecule library, with 92.9% of hits correctly classified (recall) and 99.3% precision for the non-hit class.
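To make the reported metrics concrete, the following minimal Python sketch shows how hit recall, non-hit precision, and hit-rate enrichment are conventionally computed from a binary confusion matrix. The function names and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not the study's data. Only the 5.2% and 20.6% hit rates come from the abstract.

```python
def hit_recall(tp: int, fn: int) -> float:
    """Fraction of true hits that the classifier labels as hits."""
    return tp / (tp + fn)


def non_hit_precision(tn: int, fn: int) -> float:
    """Fraction of compounds labeled non-hit that are truly non-hits."""
    return tn / (tn + fn)


def enrichment(focused_rate: float, baseline_rate: float) -> float:
    """Ratio of the hit rate in the focused library to the baseline hit rate."""
    return focused_rate / baseline_rate


# Hypothetical counts for illustration only (not taken from the paper).
tp, fn, tn, fp = 9, 1, 80, 10
recall = hit_recall(tp, fn)
npv = non_hit_precision(tn, fn)

# Hit rates quoted in the abstract: 5.2% before and 20.6% after filtering.
fold = enrichment(0.206, 0.052)

print(f"recall={recall:.3f}, non-hit precision={npv:.3f}, enrichment={fold:.2f}x")
```

With the hit rates from the abstract, the enrichment works out to roughly a fourfold increase in the chance that a compound picked from the focused library is a hit.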