Yao Pengab,
Gang Liab,
Mei Zhouc,
Huaile Wangab and
Ling Lin*ab
aState Key Laboratory of Precision Measurement Technology and Instruments, Tianjin University, Tianjin 300072, China. E-mail: linling@tju.edu.cn
bTianjin Key Laboratory of Biomedical Detecting Techniques & Instruments, Tianjin University, Tianjin 300072, China
cShanghai Key Laboratory of Multidimensional Information Processing, East China Normal University, Shanghai 200241, China
First published on 13th February 2017
Dynamic spectrum (DS) has major significance in the non-invasive measurement of blood components. Effective dynamic spectrum extraction methods can enhance the signal to noise ratio (SNR) of dynamic spectrum data and improve the non-invasive measurement accuracy. According to the principle of DS and the characteristics of the photoplethysmogram (PPG), in this paper, a new dynamic spectrum extraction method based on independent component analysis (ICA) combined with dual-tree complex wavelet transform (DTCWT) is proposed. The core of this method is to find out the closest ratio between each PPG signal at one wavelength and the template signal of PPGs which represents original blood pulsating information best as DS data. In order to testify the effectiveness of the new proposed method, experiments for the determination of hemoglobin concentration of 151 volunteers based on three different kinds of DS extraction methods coupled with partial least squares (PLS) were conducted respectively, where the root mean square error of the calibration (RMSEC), root mean square error of prediction (RMSEP), correlation coefficient of calibration (Rc) and correlation coefficient of prediction (Rp) were used as the evaluation index of the prediction performance. Compared with the other two famous DS extraction methods, frequency domain analysis and single trial estimation, ICA combined DTCWT showed better prediction ability. The forecast accuracy of the new method ICA combined DTCWT reached 90.62%, while commonly used frequency domain analysis and single trial estimation was 63.71% and 78.83%, respectively. The results show comprehensively that the dynamic spectrum extraction method based on ICA combined DTCWT is more reliable and accurate, which opens up avenues for the non-invasive study of the dynamic spectrum.
NIR spectroscopy is a non-invasive optical technique to monitor tissue oxygenation. It is currently used for the detection of blood glucose, blood oxygen, and hemoglobin concentration detection. However, the individual discrepancy and measuring condition can affect the measured spectrum, which will significantly affect the accuracy of hemoglobin concentration. Jeon13 proposed to establish the calibration model by the measured data to analyze hemoglobin concentration in 2002. Xingdan Chen14 proposed an effective signal extraction method based on subtraction of blood volume difference to obtain the effective spectral information which directly reflected the blood components. Gang Li15,16 proposed dynamic spectrum (DS) theory which extracted the absorption spectrum from pulse oximeter's photoplethysmogram (PPG) signals. The absorbance reflected the changes in blood information. Using the absorbance for modeling could eliminate individual differences of the measured subjects and the influence of measuring conditions more effectively. Compared with the two traditional theories, DS can obtain more accurate data. DS theory studies on the relationship between spectral changes and the PPG signals, and then predicts the concentration of the blood components by modeling. It has achieved a series of results on spectrum extraction, signal acquisition and clinical trials, etc.17
DS demands spectral data of high quality, thus it requires the measurement accuracy of spectral data to be as high as possible. Frequency domain analysis18 and single trial estimation19 are two famous extraction methods for DS currently. Frequency domain analysis extracts the amplitude of fundamental of the logarithmic PPG by Fast Fourier Transform (FFT), but its accuracy is low because the PPG is not a single-frequency signal. As we all know, if a signal has a certain level of total energy, the wider the frequency spectrum width, the weaker the amplitude of fundamental. Single trial estimation has a strong anti-noise ability due to the elimination of abnormal data and the error correction by line-fitting. It is possible to increase the SNR of the original PPG signal by eliminating the anomalous periods, at the same time, it is possible to improve the SNR of DS at each wavelength by averaging the DS data of each period. However, the disadvantage of this method is that the implementation process is more cumbersome for a large number of data samples, and this method still has not achieved the desired results. In addition, Butterworth filter is the main PPG signals pre-processing method for both frequency domain analysis and single trial estimation, but it is very easy to cause distortion due to the long transition zone.
In connection with time and frequency domain characteristics of PPG signals, this paper proposes a new DS extraction method by combining ICA and DTCWT. It integrates blind source separation of ICA and denoising characteristics of DTCWT to improve the accuracy of blood component measurement by DS. Firstly, pre-process the PPG signals by DTCWT aimed at removing the noise interference. Secondly, calculate the template signal using the processed PPG signals. Then put the template and the PPG signal at each wavelength to do ICA processing, resulting in the signal which mostly approximates the clean signal distribution of the arterial blood volume changes. At last, figure out the closest ratio k between each generated PPG signal at each wavelength and the template signal as DS data. All proportions constitute new DS data. The processed PPG signals result using the new method showed more desirable effects. And the extracted DS can reflect accurate absorbance changes information more effectively. Furthermore, for modeling prediction of hemoglobin concentration, the forecast results of ICA combined DTCWT reached the highest accuracy compared with the frequency domain analysis and single trial estimation.
ODi = (∑aiC)Bd + G | (1) |
Fig. 1 The generation of the PPG and a simplified model of tissue. PPG is a simple optical method to detect blood volume changes in the microvascular bed of tissue.21 The PPG waveform consists of two parts: AC (pulsatile) and DC (baseline), so the tissue can be regarded as a combination of a pulsatile part and a “static” part. |
ODi is the absorbance during one cardiac cycle corresponding to the wavelength i. G is the loss caused by the background, B is the factor of path and ai represents specific extinction coefficient of different wavelengths. Then the changes of the absorbance at wavelength i is:
(2) |
In formula (2), Ioi is the intensity of the incident light. Iimax is the maximum output intensity while Iimin is the minimum output intensity. By extracting the alternating component of the logarithmic PPG (PPG signal after logarithmic transformation), the light absorption information of arterial pulsatile blood can be obtained. Δd is the path length of pulsating blood layer. i represents the mark of multiple wavelengths. The pulsating blood layer is very thin and light scattering effects can be ignored, so BΔd can be normalized to 1. Therefore, the concentration of the substance in the blood C can be determined by the modeling method.
X(t) = AS(t) | (3) |
Or expressed as a matrix form:
(4) |
Only observed signals X(t) is known in the information above. The role of ICA algorithm is used to evaluate mixed matrix A and independent source signals S(t) according to the observed signal, and the estimated value of source signals Y(t) can be solved by following formula:
Y(t) = WTX(t) | (5) |
Formula (5) is the solving model of ICA. W = [w1, w2,⋯, wn] in formula is the inverse matrix of mixed matrix A, and it is called separating matrix or unmixing matrix.
The basic idea of ICA model solution method is using some kind of objective function and optimization algorithm to measure the independence of separate random variables. According to the central limit theorem, the measurement of dependence degree among random signals can be done by non-Gaussianity measurement. At present, negentropy25 and kurtosis are used to measure non-Gaussianity of random variables. Considering better stability of negative entropy and large amounts of PPG signals, FastICA algorithm26 whose objective function is negative entropy has been selected to achieve PPG signals separation in this study.
ICA is a statistical method and it only separate signal having the greatest statistical independence to approximate the source signal waveform. It is noteworthy that compared with source signal, separation signal exists inconsistencies of magnitude and order. For inconsistencies of order, signals can be determined by spectral analysis correspondingly and it will not affect the overall results. But the amplitude of PPG signal reflects the important pulse information, so amplitude recovery is a more critical issue. In this respect, the amplitude of the signal was recovered by using adaptive filter, and for simplicity, the Least Mean Square (LMS)27 algorithm was selected:
y(n) = wT(n)u(n) | (6) |
e(n) = d(n) − y(n) | (7) |
w(n + 1) = w(n) + μe(n)u(n) | (8) |
Among them, u(n) represents filter input, d(n) represents the desired signal, w(n) is filter weight, e(n) shows error reduction through adaptive filter and μ represents the steps for weighting vector updating.
In Fig. 2, h0(m), h1(m) represents low-pass and high-pass filter pair for the first-layer decomposition of real tree of signal x(t), respectively. g0(m), g1(m) represents low-pass and high-pass filter pair for the first-layer decomposition of imaginary tree of signal x(t), respectively. There is one sample interval delay between real tree and imaginary tree at the first layer of decomposition, so the data extracted from real tree is what imaginary tree has not extracted, that is to say they could form complementary relationships. For the decomposition of following layer, the sum of the delay difference generated by two trees in the layer and all the previous layers is a sampling period relative to the original input. The phase-frequency response of the two-tree filter should have a group delay of half a sampling period, and the amplitude-frequency response of the two-tree filter is equal. In this case the loss of information can be effectively reduced. Therefore DTCWT has the approximate translation in-variance and could reduce the effect of translation disturbances of the signal.
Due to maximum a posteriori (MAP) estimation algorithm has the advantage of being able to optimize the loss function, MAP was adopted in this paper to determine the threshold for DTCWT high frequency coefficients processing. Morphology open operation or closed operation is a compound operation combined expansion with corrosion operation. Maragos constructed morphological open-close (OC) and morphological closed-open (CO) filters.30 A combinatorial filter by averaging the morphological open-closing filter and the morphological close-opening filter was used for DTCWT low frequency coefficients processing in this research. In morphology, structural elements play the role of the filter window, and the shape of it has a great impact on the filter effect. The shape of structural elements like circular, triangular, linear, parabolic, and sinusoidal and other polygons have been widely used. In different shapes of structural elements, the parabolic structure elements can effectively suppress the impulse noise. Disc-type structural elements have rotational invariance and high smoothness. Thus these two structural elements were selected to design a generalized morphological filter in this paper. Therefore, averaging the morphological open-closing filter and the morphological close-opening filter with structural elements were combined to eliminate low frequency interference. The algorithm is as follows, where f is the input signal, g1, g2 represents different structural elements, respectively. The symbol ○ and ● denotes the open and closing operations, respectively.
GOC(f(n)) = f○g1●g2 | (9) |
GCO(f(n)) = f●g1○g2 | (10) |
The output of the filter is:
y(n) = [GOC(f(n)) + GCO(f(n))]/2 | (11) |
The proposed block processing for the objective considered here can be explained using the flow chart represented in Fig. 3.
The specific steps of DS extraction method ICA combined DTCWT are as follows.
min|PPGs − kPPGI| = 0 | (12) |
Fig. 5 shows the PPG signal (20 s) of one volunteer. It's a time-varying signal which contains the transmission spectrum over a fixed wavelength range but at different moments. At the waveband of high signal intensity, the pulse wave can be identified clearly.
After the collection of spectral data, 218 subjects were measured, 151 of which were eventually selected for modeling and predicting. The general information of these volunteers is listed in Table 1. During the 151 samples, 129 of them were selected according to the“M + N” theory31 as the training set (calibration) to establish a calibration model while the other 22 samples composed the test set (prediction).
Gender | Max age | Min age | Average age | Sample size |
---|---|---|---|---|
Male | 81 | 24 | 47.48 | 91 |
Female | 74 | 24 | 45.44 | 60 |
Overall | 81 | 24 | 46.66 | 151 |
Firstly, the method of DTCWT was used to pre-process the recorded PPG signals. Secondly, DS data of these subjects were extracted by the new method. Last but not least, these DS data were used for modeling. PLS32,33 was used as the modeling method and the number of principal components was determined by cross validation in this research. The correlation coefficient of the calibration set Rc, the root mean square error of the calibration set RMSEC, the correlation coefficient of the prediction set Rp and the root mean square error of the prediction set RMSEP were considered as the evaluation index of the prediction performance. In order to verify this new DS extraction method, two commonly used DS extraction methods were used for comparison.
All data obtained from the experiments were gathered to format convert by using Avaspec software (version 76USB2) and transferred to MATLAB. All calculations were processed by MATLAB (version 2010 a).
Fig. 6 Pre-processed results of one subject: (a) original PPG signal. (b) Processed signal by Butterworth filter. (c) Processed signal by DTCWT. |
It is obvious that both of the two methods removed the high frequency noise well. However, compared with Butterworth filter, processed signal by DTCWT showed greater smoothness. And then the template signal was obtained by superposition and averaging. As shown in Fig. 7.
Fig. 7 DTCWT pre-processing results of one subject: (a) original PPG signal. (b) Processed signal by DTCWT. (c) The template signal. |
It is obvious that the template signal removed random error and a part of baseline drift, which will give a great help for the back of the extraction work.
It is obvious that ICA was able to make blind source separated from the subgraphs above. The required signal was separated with the useless part which was irrelevant to both PPG signal and the template signal. And for the required signal, the motion artifact (MA) was removed, so it would contain more adequate and stable information of the original signal. Then the required signal was processed by LMS adaptive filtering, and the output is shown as Fig. 9.
Fig. 9 The result of LMS adaptive filter processing: (a) the input of LMS adaptive filter. (b) The output of LMS adaptive filter. |
This graph clearly indicates that the amplitude of PPG correlation signal of outputs of ICA was recovered by LMS adaptive filter. In that way, the output of LMS adaptive filter will have the same order of magnitude with the original PPG.
Fig. 10 Comparison diagram of DS by three methods: (a) DS of frequency domain analysis. (b) DS of single trial estimation. (c) DS of ICA combined DTCWT. |
It is clearly shown that DS curve of frequency domain analysis had burr which represented the high-frequency interference was not removed completely. DS curve of single trial estimation was quite smooth, but there was still a little interference on low-frequency. And DS curve of ICA combined DTCWT is the smoothest and cleanest which mean that the DS data would express better information.
DS extraction method | Range (g L−1) | RMSEC (g L−1) | Rc | RMSEP (g L−1) | RP |
---|---|---|---|---|---|
Frequency domain analysis | 90–173 | 14.7799 | 0.8230 | 17.3135 | 0.6371 |
Single trial estimation | 90–173 | 10.6272 | 0.8578 | 12.1011 | 0.7883 |
ICA combined DTCWT | 90–173 | 10.2818 | 0.9389 | 8.6401 | 0.9062 |
Fig. 11 Prediction values of three extraction methods: (a) frequency domain analysis. (b) Single trial estimation. (c) ICA combined DTCWT. |
Table 2 showed that the correlation coefficient RP and the RMSEP value by using frequency domain analysis were 0.6371 and 17.3135. The correlation coefficient RP and the RMSEP value by using single trial estimation were 0.7883 and 12.1011. However, the correlation coefficient RP by using ICA combined DTCWT was the best, and the value reached 0.9062. Compared with the two methods, it was increased by 42.24% and 14.96%, respectively. Furthermore, the RMSEP value of the new method was 8.6401. Compared with the two methods, it was reduced by 50.10% and 28.60%, respectively. Obviously, the forecast result by using the new method worked best. Comparative prediction values of hemoglobin concentration by using different extraction methods are shown as in Fig. 11.
Fig. 11 clearly illustrates that the predicted value using ICA combined DTCWT had the minimum difference with the theoretical value. Namely, this method worked best on predicting. The above researches jointly demonstrate that the proposed new method could reach a relatively desired consequence for DS modeling forecast, which will promote the non-invasive measurement of blood components progress effectively and accurately.
This journal is © The Royal Society of Chemistry 2017 |