Infrared spectroscopy with multivariate analysis segregates low-grade cervical cytology based on likelihood to regress, remain static or progress†
Abstract
Cervical cancer is the second most common female cancer worldwide. However, in the developed world, cervical screening has reduced this cancer burden. Most smear referrals are low-grade, requiring continuous monitoring until they regress. Others need monitoring for static disease, while a few require treatment due to persistent low-grade or progressive disease. The ‘Holy Grail’ in cervical screening is predicting which patients are likely to have progressive disease. Fourier-transform infrared (FTIR) spectroscopy exploits the fact that an infrared (IR) spectrum represents a “biochemical-cell fingerprint”, which can be obtained from a cellular specimen based on the wavenumber-dependent absorption band pattern of its constituents' vibrating chemical bonds. Low-grade (CIN1) specimens (n = 67) diagnosed on cytology were analysed using IR spectroscopy. All 67 study participants were rescreened by conventional cytology after a year, whereupon three showed progressive disease, 31 had persistent low-grade atypia and 33 had regressed. Spectra from the initial cytology samples were then analysed using principal component analysis (PCA), with the output (10 principal components) used as input to linear discriminant analysis (LDA) to predict which samples would progress, remain static or regress; this approach was compared with two variable selection techniques, namely the successive projection algorithm (SPA) and the genetic algorithm (GA). Significant wavenumbers distinguishing regressive vs. static disease were 1736 cm⁻¹, 1680 cm⁻¹, 1512 cm⁻¹, 1234 cm⁻¹, 1099 cm⁻¹ and 968 cm⁻¹; however, separating these two categories is difficult due to a significant degree of ‘overlap’. Progressive disease can be significantly differentiated from static disease based on wavenumbers 1662 cm⁻¹, 1648 cm⁻¹, 1628 cm⁻¹, 1512 cm⁻¹, 1474 cm⁻¹ and 965 cm⁻¹; it can be segregated from regressive disease with 1686 cm⁻¹, 1674 cm⁻¹, 1625 cm⁻¹, 1561 cm⁻¹, 1525 cm⁻¹ and 1310 cm⁻¹.
The GA–LDA model shows good separation for all categories (i.e., regressive vs. static vs. progressive disease) using 35 wavenumbers. An ability to predict progressive disease will reduce the need for repeat smears every six months whilst allowing early identification of patients who require treatment.
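The PCA–LDA step described above (spectra reduced to 10 principal components, which are then passed to LDA) can be illustrated with a minimal sketch. This is not the authors' implementation: the spectra below are synthetic, the wavenumber grid, band centres and noise level are arbitrary assumptions, and the helper names (`synthetic_spectra`, `pca_fit`, `pca_transform`, `lda_fit`, `lda_predict`) are hypothetical. Only the class sizes (33 regressive, 31 static, 3 progressive) and the number of components are taken from the abstract.

```python
import numpy as np


def synthetic_spectra(rng, wavenumbers, counts, band_centres):
    """Generate toy spectra: a shared band plus one class-specific band (assumed)."""
    base = np.exp(-0.5 * ((wavenumbers - 1650.0) / 30.0) ** 2)
    X, y = [], []
    for label, (n, centre) in enumerate(zip(counts, band_centres)):
        band = 0.5 * np.exp(-0.5 * ((wavenumbers - centre) / 15.0) ** 2)
        noise = rng.normal(0.0, 0.02, size=(n, wavenumbers.size))
        X.append(base + band + noise)
        y.extend([label] * n)
    return np.vstack(X), np.array(y)


def pca_fit(X, n_components):
    """Return the column mean and the leading principal axes (as rows) via SVD."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def pca_transform(X, mean, components):
    """Project mean-centred spectra onto the principal axes to obtain scores."""
    return (X - mean) @ components.T


def lda_fit(Z, y):
    """Fit LDA with a pooled within-class covariance and empirical class priors."""
    classes = np.unique(y)
    means = np.array([Z[y == c].mean(axis=0) for c in classes])
    pooled = np.zeros((Z.shape[1], Z.shape[1]))
    for c, m in zip(classes, means):
        d = Z[y == c] - m
        pooled += d.T @ d
    pooled /= len(y) - len(classes)
    priors = np.array([(y == c).mean() for c in classes])
    return classes, means, np.linalg.pinv(pooled), priors


def lda_predict(Z, model):
    """Assign each sample to the class with the highest linear discriminant score."""
    classes, means, inv_cov, priors = model
    # score_k(z) = z' S^-1 mu_k - 0.5 * mu_k' S^-1 mu_k + log(prior_k)
    quad = 0.5 * np.sum((means @ inv_cov) * means, axis=1)
    scores = Z @ inv_cov @ means.T - quad + np.log(priors)
    return classes[np.argmax(scores, axis=1)]


rng = np.random.default_rng(0)
wavenumbers = np.linspace(900.0, 1800.0, 226)  # assumed grid, in cm^-1

# Labels: 0 = regressive, 1 = static, 2 = progressive (class sizes from the abstract)
X, y = synthetic_spectra(
    rng, wavenumbers, counts=(33, 31, 3), band_centres=(1000.0, 1250.0, 1500.0)
)

mean, components = pca_fit(X, n_components=10)
Z = pca_transform(X, mean, components)  # PCA scores, shape (67, 10)
model = lda_fit(Z, y)
pred = lda_predict(Z, model)

# Resubstitution accuracy on the training data; not a validated estimate.
accuracy = float((pred == y).mean())
print(f"scores shape: {Z.shape}, resubstitution accuracy: {accuracy:.2f}")
```

The sketch fits and evaluates on the same samples for brevity. With only three progressive cases, any real analysis would need cross-validation or an independent test set, and the synthetic classes here are far better separated than the ‘overlap’ reported for regressive vs. static disease would suggest.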