Soft classification of single samples based on multi-analyte spectra†
Abstract
Chemical fingerprinting based on multi-analyte spectra can be very powerful. An example is the analysis and classification of forensic documents and artworks by means of the plume fluorescence spectra of their pigments. For that purpose, we borrow the concept of the Gaussian naïve Bayes classifier and develop a soft classification scheme that estimates class membership probabilities. The scheme is based on the similarity of the unknown sample to the training observations, as measured by the radial basis function kernel in the class space of orthogonal partial-least-squares discriminant analysis. We apply the scheme to the classification of plume fluorescence spectra of Chinese red seal inks and compare its performance against that of a conventional hard classification scheme. Our scheme gives 98.9% sensitivity, 99.8% specificity, and a zero false in-class rate, all better than those of the hard scheme. More importantly, our scheme reports class membership probabilities for each and every test sample, which is especially useful for sorting single samples. For example, we show that samples with assigned probabilities higher than 80% are sorted correctly 99.5% of the time; their classification is therefore highly reliable. For samples with assigned probabilities below 80%, the correct sorting rate is only 82%. These cases are few, however (fewer than 4% of the samples), and their relatively low membership probabilities still serve as flags for further sampling of the specimen.
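The idea of turning kernel similarities into class membership probabilities can be illustrated with a minimal sketch. It is one plausible reading of the scheme outlined above, not the authors' implementation. It assumes that the samples have already been projected into a low-dimensional score space (the OPLS-DA step is not shown), that class priors are equal, and that the kernel width `gamma` is a free parameter. The function names, the toy coordinates, and the class labels are hypothetical; only the 80% flagging threshold is taken from the abstract.

```python
import numpy as np


def soft_membership(train_scores, train_labels, sample, gamma=1.0):
    """Estimate class membership probabilities for a single sample.

    Assumption: all inputs are coordinates in a precomputed score space.
    """
    train_scores = np.asarray(train_scores, dtype=float)
    train_labels = np.asarray(train_labels)
    sample = np.asarray(sample, dtype=float)

    # RBF kernel similarity between the sample and each training observation.
    sq_dist = np.sum((train_scores - sample) ** 2, axis=1)
    similarity = np.exp(-gamma * sq_dist)

    # Mean similarity per class, assuming equal class priors.
    classes = np.unique(train_labels)
    class_scores = np.array(
        [similarity[train_labels == c].mean() for c in classes]
    )

    total = class_scores.sum()
    if total == 0.0:
        # Sample is far from all training data: fall back to uniform.
        probs = np.full(len(classes), 1.0 / len(classes))
    else:
        probs = class_scores / total
    return classes, probs


def classify_with_flag(classes, probs, threshold=0.8):
    """Pick the most probable class and flag low-confidence assignments."""
    best = int(np.argmax(probs))
    return classes[best], float(probs[best]), bool(probs[best] < threshold)


# Toy data: two well-separated hypothetical ink classes in a 2-D score space.
train_scores = [[0.0, 0.0], [0.1, -0.1], [3.0, 3.0], [3.1, 2.9]]
train_labels = ["ink_A", "ink_A", "ink_B", "ink_B"]

# A sample close to the ink_A cluster.
classes, probs = soft_membership(train_scores, train_labels, [0.05, 0.0])
label, prob, flagged = classify_with_flag(classes, probs)

# A sample equidistant from all training points, hence ambiguous.
_, probs_mid = soft_membership(train_scores, train_labels, [1.55, 1.45])
_, prob_mid, flagged_mid = classify_with_flag(classes, probs_mid)
```

In this sketch the first sample is assigned to `ink_A` with a probability close to one, whereas the equidistant sample receives equal probabilities for both classes and is flagged, which is the kind of case where further sampling of the specimen would be called for.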