Feature engineering applied to intraoperative in vivo Raman spectroscopy sheds light on molecular processes in brain cancer: a retrospective study of 65 patients†
Abstract
Raman spectroscopy is a promising tool for neurosurgical guidance and cancer research. Quantitative analysis of the Raman signal from living tissues remains limited, however: the molecular composition of these tissues is complex and influenced by clinical factors, and access to data is restricted. To ensure acceptance of this technology by clinicians and cancer scientists, analytical methods must be adapted to model the Raman-generating process more closely. Our objective is to use feature engineering to develop a new representation of spectral data, specifically tailored for brain diagnosis, that improves the interpretability of the Raman signal while retaining enough information to accurately predict tissue content. The method consists of fitting Raman bands that consistently appear in the brain Raman literature and generating new features that represent the pairwise interactions between bands and the interactions between bands and patient age. Our technique was applied to a dataset of 547 in situ Raman spectra from 65 patients undergoing glioma resection. It showed superior predictive capacity compared with dimensionality reduction by principal component analysis. Analysis within a Bayesian framework allowed us to identify the oncogenic processes that characterize glioma: increased nucleic acid content, overexpression of type IV collagen, and a shift in the primary metabolic engine. Our results demonstrate how this mathematical transformation of the Raman signal allows the first biological, statistically robust analysis of in vivo Raman spectra from brain tissue.
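The feature-generation step described in the abstract can be illustrated with a minimal sketch. It assumes that band fitting has already produced one height per band, and that an interaction is modeled as a simple product; the abstract does not state the functional form, so this is an assumption and not necessarily the authors' method. The function name `interaction_features`, the band labels and the numerical values are hypothetical.

```python
from itertools import combinations


def interaction_features(band_heights, age):
    """Build an engineered feature dict from fitted band heights.

    band_heights: dict mapping a band label to its fitted height.
    age: patient age in years.
    """
    # Keep the original band features.
    features = dict(band_heights)
    labels = sorted(band_heights)
    # Pairwise band-band interactions, modeled as products (assumption).
    for a, b in combinations(labels, 2):
        features[f"{a}*{b}"] = band_heights[a] * band_heights[b]
    # Band-age interactions, also modeled as products (assumption).
    for a in labels:
        features[f"{a}*age"] = band_heights[a] * age
    return features


# Hypothetical band labels and values, for illustration only.
example = interaction_features(
    {"band_A": 2.0, "band_B": 0.5, "band_C": 1.0}, age=50
)
```

With n fitted bands, this representation contains n band features, n(n-1)/2 band-band interactions and n band-age interactions. Each feature stays tied to named bands, which is what makes the representation easier to interpret than principal components.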