Young Woo Kim^a,
Hee-Jin Yu^a,
Jung-Sun Kim^b,
Jinyong Ha^c,
Jongeun Choi*^a and
Joon Sang Lee*^a
^a Department of Mechanical Engineering, Yonsei University, Korea. E-mail: joonlee@yonsei.ac.kr; jongeunchoi@yonsei.ac.kr
^b Division of Cardiology, Severance Cardiovascular Hospital, Yonsei University College of Medicine, Korea
^c Department of Electrical Engineering, Sejong University, Korea
First published on 24th January 2020
A two-step machine learning (ML) algorithm for estimating both fractional flow reserve (FFR) and decision (DEC) for the coronary artery is introduced in this study. The primary purpose of this model is to show that ML-based FFR has the potential to be more accurate than the FFR calculation technique based on a computational fluid dynamics (CFD) method. For this purpose, a two-step ML algorithm that considers the flow characteristics and biometric features as input features of the ML model is designed. The first step of the algorithm is based on the Gaussian process regression model and is trained on synthetic models analyzed with CFD. The second step of the algorithm is based on a support vector machine trained with patient data, including flow characteristics and biometric features. Consequently, the accuracy of the FFR estimated from the first step of the algorithm was similar to that of the CFD-based method, while the accuracy of DEC in the second step was improved. This improvement in accuracy was analyzed using flow characteristics and biometric features.
To overcome these problems, an FFR calculation technique based on computational fluid dynamics (CFD) has been developed. The CFD-based FFR calculation technique, or FFRCFD, can avoid some invasive procedures by using geometric information from computed tomography (CT) or optical coherence tomography (OCT) images, along with the estimated or assumed boundary conditions. According to Coenen et al., the accuracy of FFRCFD is ∼80% with a sensitivity of 87.5% and a specificity of 67.5%;5 however, it requires expensive computational resources and a computational time of >8 h.6
Recently, to overcome the limitations of FFRCFD, machine learning (ML) methods have been studied. Compared with the CFD method, ML can perform the calculation in a few minutes using fewer computational resources. Researchers from various medical fields are adopting ML for diagnosis and disease prediction. For example, Tripathy et al. performed a study on the classification of breast cancer from cellular images with an ML algorithm.7 Moreover, Khanmohammadi et al. attempted to apply an ML algorithm to diagnose basal cell carcinoma from blood samples.8
Certain studies related to cell rheology, such as the study by Kihm et al., used ML to classify cell types in blood.9 Furthermore, attempts have been made to apply ML to estimate FFR. Kim et al. trained an ML model using intravascular ultrasound (IVUS) images to predict FFR and achieved an accuracy of 81%.10 The limitations of their study were that the process was tedious and the IVUS image segmentation step was manual, which made it difficult to further increase the input data for the ML training. Another approach to address the lack of ML training data is to use a synthetic model. For example, Itu et al. generated a synthetic model of the circulatory system for ML training;11 the model was generated by randomly extracting the characteristics of patient data. Then, CFD was used to calculate the FFR from these synthetic models. Unlike patient data, synthetic models can be generated in arbitrarily large numbers. Furthermore, because the features can be easily controlled, the uniformity of the data can be enhanced. Consequently, the ML model could estimate the FFR with the same accuracy as FFRCFD; however, this indicates that the accuracy of ML-based FFR is limited to that of FFRCFD, which is 83% at maximum.
Tesche et al. and Hu et al. attempted to use synthetic models with other ML models to achieve an accuracy similar to that of FFRCFD.12,13 Using ML-based FFR merely to increase the computational speed, while its accuracy remains capped by that of FFRCFD, is an ineffective approach. The advantage of ML is the possibility of considering various features and determining their relationships. By maximizing the quality and quantity of the input features, the accuracy of ML-based FFR could surpass that of FFRCFD.
In this study, a two-step ML algorithm for estimating both FFR and decision (DEC) is introduced. The primary purpose of this model is to show that ML-based FFR has the potential to be more accurate than FFRCFD. For this purpose, a two-step ML algorithm that considers flow characteristics and biometric features as input features of ML is designed. Flow characteristics are the primary cause of pressure drop; thus, FFR is affected by both stenosis severity and flow characteristics. The relationship between the geometric features and FFR has been analyzed based on flow characteristics such as vorticity or turbulence intensity.14,15 By providing flow characteristics as input parameters, the ML algorithm has more information with which to increase its accuracy.
Furthermore, regardless of the geometric features, biometric features can affect FFR. The limitation of CFD is the absence of a method that considers biometric features. Various attempts have been made to reflect biometric features, such as age or body mass index (BMI), in CFD;16–18 however, these models are based on various assumptions and empirical equations that result in low accuracy. If these features can be analyzed using ML, the accuracy of ML-based FFR could surpass that of FFRCFD. In this study, a two-step ML algorithm was developed to efficiently handle the flow characteristics and biometric features.
The summary of the algorithm process is shown in Fig. 1. This algorithm separately provides both estimated FFR and DEC. In the first step, the Gaussian process regression (GPR) model is used to calculate FFRGPR. In the second step, a support vector machine (SVM) is used to calculate DECSVM.19 The GPR model is trained on the CFD results of the synthetic models; therefore, the target accuracy of FFRGPR is the same as that of FFRCFD. However, the SVM model is trained on FFRGPR together with the flow characteristics and biometric features; therefore, the target accuracy of DECSVM should be higher than those of FFRGPR or FFRCFD.
This study focuses on categorizing and analyzing the mismatched cases of the two-step ML algorithm. Mismatch is defined as the wrong estimation of either FFR or DEC compared to FFREXP or DECEXP. To prove the need for the additional features, the flow characteristics and biometric features are analyzed for these mismatched cases.
(1) |
(2) |
(3) |
(4) |
In eqn (1), the density distribution function $f_i(\mathbf{x},t)$ indicates the proportion of particles moving with the i-th lattice velocity at lattice site $\mathbf{x}$ and time $t$; $\Delta t$ is the time step; $\tau$ is the particle relaxation time; $\mathbf{e}_i$ is the discrete microscopic velocity; $f_i^{\mathrm{eq}}$ is the local equilibrium distribution function; and $c_s = c/\sqrt{3}$ is the speed of sound with $c = \Delta x/\Delta t$. The fluid density $\rho$ and velocity $\mathbf{u}$ can be calculated using the following formula:
(5) |
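For orientation, the symbols defined above are the ones that appear in the textbook single-relaxation-time (BGK) lattice Boltzmann scheme, which can be sketched as follows. This is a sketch under stated assumptions, not necessarily the exact form used in this study: the BGK collision operator, the second-order equilibrium expansion, the lattice weights $w_i$, and a relaxation time $\tau$ expressed in units of $\Delta t$ are all assumed rather than taken from the text.

```latex
% Streaming and BGK collision (assumed standard form)
f_i(\mathbf{x} + \mathbf{e}_i \Delta t,\; t + \Delta t) - f_i(\mathbf{x}, t)
  = -\frac{1}{\tau}\left[ f_i(\mathbf{x}, t) - f_i^{\mathrm{eq}}(\mathbf{x}, t) \right]

% Second-order equilibrium distribution with assumed lattice weights w_i
f_i^{\mathrm{eq}} = w_i\, \rho \left[ 1
  + \frac{\mathbf{e}_i \cdot \mathbf{u}}{c_s^{2}}
  + \frac{(\mathbf{e}_i \cdot \mathbf{u})^{2}}{2 c_s^{4}}
  - \frac{\mathbf{u} \cdot \mathbf{u}}{2 c_s^{2}} \right],
  \qquad c_s = \frac{c}{\sqrt{3}}, \quad c = \frac{\Delta x}{\Delta t}

% Macroscopic density and velocity as moments of f_i
\rho = \sum_i f_i, \qquad \rho\, \mathbf{u} = \sum_i f_i\, \mathbf{e}_i
```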
The kinematic viscosity of plasma is given as follows:
(6) |
Moreover, the local shear stress and local dynamic viscosity can be calculated as follows:
(7) |
(8) |
The inlet boundary condition of the simulation was a pulsatile pressure inlet whose maximum and minimum pressures were obtained from the patient information. For the outlet boundary condition, the Windkessel model was used to reflect the compliance of the blood vessel, which was estimated from the height, BMI, and age of the patients. Note that for the synthetic models, the inlet and outlet parameters are randomized.
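To illustrate how a Windkessel outlet couples flow and pressure through vessel compliance, the following minimal sketch integrates a two-element Windkessel model, C dP/dt = Q(t) − P/R, with a forward Euler step. The two-element variant, the function names, and the nondimensional parameter values are assumptions chosen for illustration; the text does not state which Windkessel variant was used or how the compliance was computed from height, BMI, and age.

```python
def windkessel_step(pressure, flow, resistance, compliance, dt):
    """One forward Euler step of the two-element Windkessel model (assumed variant)."""
    return pressure + dt * (flow - pressure / resistance) / compliance


def simulate_outlet_pressure(flow_fn, resistance, compliance, p0, dt, n_steps):
    """Integrate the outlet pressure for a given inflow function flow_fn(t)."""
    pressure = p0
    history = [pressure]
    for step in range(n_steps):
        pressure = windkessel_step(
            pressure, flow_fn(step * dt), resistance, compliance, dt
        )
        history.append(pressure)
    return history


# Hypothetical nondimensional parameters, not patient values.
RESISTANCE = 2.0
COMPLIANCE = 0.5
pressure_history = simulate_outlet_pressure(
    flow_fn=lambda t: 1.0,  # constant inflow, for a simple steady-state check
    resistance=RESISTANCE,
    compliance=COMPLIANCE,
    p0=0.0,
    dt=0.001,
    n_steps=20000,
)
```

With a constant inflow the pressure relaxes toward Q·R with time constant R·C, so a larger compliance slows the pressure response; a pulsatile `flow_fn` could be substituted without changing the integrator.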
For ML training, geometric features are extracted from the OCT-CT fusion image. The extracted features are diameter; length; the curvatures of the proximal, central, and distal segments of the lumen; and the cross section eccentricity. Table 1 lists the geometric features and their average, minimum, and maximum values.
Geometric feature | Average | Minimum | Maximum |
---|---|---|---|
Total length (mm) | 3.8 | 1.7 | 7.3 |
Proximal area (mm²) | 0.81 | 0.54 | 0.94 |
Center area (mm²) | 0.73 | 0.48 | 0.97 |
Distal area (mm²) | 0.76 | 0.53 | 0.94 |
Total curvature (degree) | 21 | 0 | 90 |
Cross section eccentricity (%) | 31 | 4 | 78 |
In addition, synthetic vessel models are generated to amplify the quantity of data available for training the ML algorithm. The synthetic models are generated using the same geometric features extracted from the OCT-CT fusion image, with each value randomized between the minimum and maximum values found in the patient data. The biometric values used in the fluid or boundary conditions were randomized as well. Table 1 lists the exact range of values of each parameter. The synthetic models are used to train the first step of our two-step algorithm.
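The randomization described above can be sketched as follows. The ranges are copied from Table 1; the uniform sampling distribution, the dictionary keys, and the function name are assumptions made for illustration, since the text only states that each value was randomized within the patient-data range.

```python
import random

# (minimum, maximum) ranges copied from Table 1.
FEATURE_RANGES = {
    "total_length_mm": (1.7, 7.3),
    "proximal_area_mm2": (0.54, 0.94),
    "center_area_mm2": (0.48, 0.97),
    "distal_area_mm2": (0.53, 0.94),
    "total_curvature_deg": (0.0, 90.0),
    "eccentricity_pct": (4.0, 78.0),
}


def sample_synthetic_vessel(rng):
    """Draw one synthetic geometry; uniform sampling is an assumption."""
    return {
        name: rng.uniform(low, high)
        for name, (low, high) in FEATURE_RANGES.items()
    }


rng = random.Random(0)  # fixed seed so the sketch is reproducible
synthetic_models = [sample_synthetic_vessel(rng) for _ in range(1000)]
```

Each sampled geometry would then be passed to the CFD solver to obtain its FFRCFD label for training the first step.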
GPR is able to statistically model and predict an arbitrary smooth function even with a small number of observations.27–30 A Gaussian process (GP) is a set of random variables, any finite number of which have a joint Gaussian distribution. If $\{f(x), x \in \mathbb{R}^d\}$ is a GP, then given n observations $x_1, x_2, \dots, x_n$, the joint distribution of the random variables $f(x_1), f(x_2), \dots, f(x_n)$ is Gaussian. A GP is defined by its mean function $m(x)$ and covariance function $k(x,x')$, which satisfy:
$E[f(x)] = m(x)$ | (9) |
$E[\{f(x) - m(x)\}\{f(x') - m(x')\}] = k(x,x')$ | (10) |
Consider the model $h(x)^T\beta + f(x)$, where $f(x)$ is drawn from a zero-mean GP with covariance function $k(x,x')$, that is, $f(x) \sim GP(0, k(x,x'))$. Here, $h(x)$ is a set of basis functions that transform the original feature vector $x$ in $\mathbb{R}^d$ into a new feature vector $h(x)$ in $\mathbb{R}^p$, and $\beta$ is a p-by-1 vector of basis function coefficients. This model represents a GPR model. An instance of the response $y$ can be modeled as
$P(y_i \mid f(x_i), x_i) \sim N(y_i \mid h(x_i)^T\beta + f(x_i), \sigma^2)$ | (11) |
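The GPR model above can be illustrated with a small self-contained sketch. It assumes a constant basis h(x) = 1 with β approximated by the sample mean of the responses, a squared-exponential covariance function, and fixed hyperparameters; none of these choices is stated in the text, and the training values are hypothetical rather than CFD results.

```python
import math


def sq_exp_kernel(xa, xb, length_scale=0.2, signal_var=1.0):
    """Squared-exponential covariance k(x, x'); the kernel choice is an assumption."""
    dist2 = sum((a - b) ** 2 for a, b in zip(xa, xb))
    return signal_var * math.exp(-0.5 * dist2 / length_scale ** 2)


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(matrix)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def solve_with_cholesky(low, rhs):
    """Solve (L L^T) out = rhs by forward then backward substitution."""
    n = len(rhs)
    z = [0.0] * n
    for i in range(n):
        z[i] = (rhs[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    out = [0.0] * n
    for i in reversed(range(n)):
        out[i] = (z[i] - sum(low[k][i] * out[k] for k in range(i + 1, n))) / low[i][i]
    return out


def gpr_fit_predict(x_train, y_train, x_test, noise_var=1e-6):
    """Posterior mean and latent variance of h(x)^T beta + f(x) with h(x) = 1."""
    beta = sum(y_train) / len(y_train)  # simplification: beta taken as the sample mean
    n = len(x_train)
    gram = [
        [sq_exp_kernel(x_train[i], x_train[j]) + (noise_var if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    low = cholesky(gram)
    alpha = solve_with_cholesky(low, [y - beta for y in y_train])
    means, variances = [], []
    for xs in x_test:
        k_star = [sq_exp_kernel(xs, xt) for xt in x_train]
        means.append(beta + sum(k * a for k, a in zip(k_star, alpha)))
        v = solve_with_cholesky(low, k_star)
        variances.append(sq_exp_kernel(xs, xs) - sum(k * w for k, w in zip(k_star, v)))
    return means, variances


# Hypothetical one-dimensional training data (normalized feature -> FFR-like value).
x_train = [[0.0], [0.25], [0.5], [0.75], [1.0]]
y_train = [0.95, 0.90, 0.82, 0.74, 0.65]
means, variances = gpr_fit_predict(x_train, y_train, x_train + [[5.0]])
```

Near the training inputs the posterior mean follows the data, while far from them it reverts to the basis term and the variance returns to the prior level, which is the behavior that makes GPR usable with few observations.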
SVM is a widely used supervised machine learning classifier.31,32 The purpose of the SVM is to construct an optimal hyperplane that separates the samples with the maximum margin. The SVM handles the classification of nonlinear data by nonlinearly mapping the input space to a higher dimensional feature space using an appropriate kernel.33
The major advantage of SVM is that it guarantees global optimality.34 Let there be N data points $\{(x_i, y_i)\}_{i=1}^{N}$, where $x_i$ is the ith feature vector and $y_i \in \{-1, +1\}$ is the ith class label. Then, the hyperplane decision function $f(x) = \mathrm{sgn}(w^T x + b)$, where $w$ is a weight vector and $b$ is a bias, can be expressed as
(12) |
(13) |
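For reference, a standard textbook statement of the maximum-margin problem associated with this decision function is sketched below. The soft-margin form, with slack variables $\xi_i$ and penalty parameter $C$, is an assumption and is not taken from the text; the hard-margin case follows by setting all $\xi_i = 0$.

```latex
\min_{\mathbf{w},\, b,\, \boldsymbol{\xi}} \;
  \frac{1}{2} \lVert \mathbf{w} \rVert^{2} + C \sum_{i=1}^{N} \xi_i
\quad \text{subject to} \quad
  y_i \left( \mathbf{w}^{T} \mathbf{x}_i + b \right) \ge 1 - \xi_i,
  \qquad \xi_i \ge 0, \quad i = 1, \dots, N
```

Because the objective is convex and the constraints are linear, any local minimum is a global one, which is the sense in which global optimality is guaranteed.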
Features used in this study consist of biometric features including age, BMI, vessel calcification, etc. and dynamic features related to flow characteristics including vorticity, helicity, OSI, etc. Each flow characteristic has 11 points along the direction of the length of the vessel. Using these points, dynamic features were made by considering raw data, max/min value, max/min index, differences between points, derivatives, max/min value/index of derivatives, and area under the curve. A set of 94 features is extracted from each of the six flow characteristics and 14 demographic features are added to create a total of 580 features.
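The construction of dynamic features from an 11-point profile can be sketched as follows. The feature names, the uniform point spacing, and the trapezoidal rule for the area under the curve are assumptions, and the sketch yields 40 features per profile rather than the exact set of 94 used in the study; the profile values are hypothetical.

```python
def profile_features(values, spacing=0.1):
    """Derive summary features from one flow characteristic sampled along the vessel."""
    n = len(values)
    diffs = [values[i + 1] - values[i] for i in range(n - 1)]
    derivs = [d / spacing for d in diffs]  # first-order finite differences
    # Trapezoidal area under the curve along the vessel length.
    auc = sum(0.5 * (values[i] + values[i + 1]) * spacing for i in range(n - 1))

    features = {f"raw_{i}": v for i, v in enumerate(values)}
    features.update({f"diff_{i}": d for i, d in enumerate(diffs)})
    features.update({f"deriv_{i}": d for i, d in enumerate(derivs)})
    features.update({
        "max_value": max(values),
        "min_value": min(values),
        "max_index": values.index(max(values)),
        "min_index": values.index(min(values)),
        "deriv_max_value": max(derivs),
        "deriv_min_value": min(derivs),
        "deriv_max_index": derivs.index(max(derivs)),
        "deriv_min_index": derivs.index(min(derivs)),
        "auc": auc,
    })
    return features


# Hypothetical vorticity profile at 11 points along the vessel length.
vorticity_profile = [0.0, 0.5, 1.0, 2.0, 4.0, 3.0, 2.0, 1.0, 0.5, 0.25, 0.0]
vorticity_features = profile_features(vorticity_profile)
```

Applying the same function to each flow characteristic and concatenating the results with the biometric features gives one feature vector per patient.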
Before training the SVM model, feature selection is performed. Feature selection is an essential technique in machine learning: highly correlated or irrelevant features increase the operation time and computational load and degrade performance. Feature selection techniques can be used to prevent overfitting and to improve model performance by minimizing variance and maximizing the generalizability of the model. In this paper, Boruta is employed as the feature selection method. Boruta is an all-relevant feature selection method implemented as a wrapper algorithm around the Random Forest classifier.35 It works through the following procedure:
(1) Add shuffled copies of all features (called shadow features) to the data set.
(2) Train the Random Forest classifier on the extended data set and gather the feature importance scores, expressed as Z scores.
(3) Check the importance of each real feature by comparing its Z score with the maximum Z score among the shadow features, and remove real features with lower Z scores.
(4) Repeat the process until an importance decision has been assigned to all features, or until the algorithm reaches a preset limit on the number of Random Forest runs.
The Boruta algorithm thus divides the features into confirmed and rejected sets.
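The shadow-feature comparison at the core of this procedure can be sketched as follows. This is a simplified stand-in and not the Boruta implementation: the Random Forest Z score is replaced by the absolute Pearson correlation with the label, the statistical test is replaced by a majority vote over a fixed number of rounds, and rejected features are not removed between rounds. All names and data are hypothetical.

```python
import random


def abs_correlation(column, labels):
    """Stand-in importance score: |Pearson correlation| (NOT a Random Forest Z score)."""
    n = len(column)
    mean_x = sum(column) / n
    mean_y = sum(labels) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(column, labels))
    var_x = sum((x - mean_x) ** 2 for x in column)
    var_y = sum((y - mean_y) ** 2 for y in labels)
    if var_x == 0 or var_y == 0:
        return 0.0
    return abs(cov) / (var_x * var_y) ** 0.5


def shadow_feature_selection(columns, labels, n_rounds=50, seed=0):
    """Split feature indices into confirmed and rejected using shuffled shadow copies."""
    rng = random.Random(seed)
    hits = [0] * len(columns)
    for _ in range(n_rounds):
        # Step 1: build shadow features by shuffling a copy of every real feature.
        shadows = []
        for column in columns:
            shadow = list(column)
            rng.shuffle(shadow)
            shadows.append(shadow)
        # Steps 2-3: score everything and compare against the best shadow score.
        best_shadow = max(abs_correlation(s, labels) for s in shadows)
        for j, column in enumerate(columns):
            if abs_correlation(column, labels) > best_shadow:
                hits[j] += 1
    # Step 4 (simplified): majority vote instead of Boruta's statistical test.
    confirmed = [j for j, h in enumerate(hits) if h > n_rounds / 2]
    rejected = [j for j in range(len(columns)) if j not in confirmed]
    return confirmed, rejected


# Hypothetical data: feature 0 reproduces the label, feature 1 is unrelated to it.
labels = [i % 2 for i in range(40)]
informative = [float(v) for v in labels]
unrelated = [float((2 * i) % 5) for i in range(40)]
confirmed, rejected = shadow_feature_selection([informative, unrelated], labels)
```

The shadow features provide a data-driven noise floor: a real feature is kept only if it scores above what a randomly permuted feature can achieve by chance.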
Number | Patient ID | FFREXP | DECEXP | FFRCFD | FFRGPR | DECSVM | Category |
---|---|---|---|---|---|---|---|
1 | F155 | 0.38 | 1 | 0.722 | 0.723 | 1 | 1 (matched) |
2 | F187 | 0.53 | 1 | 0.624 | 0.622 | 1 | 1 (matched) |
3 | F172 | 0.71 | 1 | 0.767 | 0.773 | 1 | 1 (matched) |
4 | F200 | 0.78 | 1 | 0.696 | 0.701 | 1 | 1 (matched) |
5 | F134 | 0.79 | 1 | 0.704 | 0.698 | 1 | 1 (matched) |
6 | F194 | 0.85 | 0 | 0.842 | 0.838 | 0 | 1 (matched) |
7 | F87 | 0.86 | 0 | 0.904 | 0.906 | 0 | 1 (matched) |
8 | F133 | 0.87 | 0 | 0.823 | 0.819 | 0 | 1 (matched) |
9 | F18 | 0.9 | 0 | 0.901 | 0.901 | 0 | 1 (matched) |
10 | F176 | 0.91 | 0 | 0.847 | 0.844 | 0 | 1 (matched) |
11 | F201 | 0.94 | 0 | 0.926 | 0.928 | 0 | 1 (matched) |
12 | F152 | 0.88 | 0 | 0.752 | 0.745 | 0 | 2 (only SVM matched) |
13 | F188 | 0.88 | 0 | 0.782 | 0.784 | 0 | 2 (only SVM matched) |
14 | F159 | 0.90 | 0 | 0.759 | 0.763 | 0 | 2 (only SVM matched) |
15 | F198 | 0.77 | 1 | 0.789 | 0.8140 | 1 | 2 (only SVM matched) |
16 | F178 | 0.79 | 1 | 0.799 | 0.7942 | 0 | 3 (only GPR matched) |
17 | F163 | 0.78 | 1 | 0.799 | 0.7920 | 0 | 3 (only GPR matched) |
18 | F136 | 0.6 | 1 | 0.86 | 0.8537 | 0 | 4 (mismatched) |
19 | F116 | 0.77 | 1 | 0.829 | 0.8291 | 0 | 4 (mismatched) |
20 | F168 | 0.94 | 0 | 0.760 | 0.7512 | 1 | 4 (mismatched) |
Metric (%) | FFRCFD | FFRGPR | DECSVM |
---|---|---|---|
Accuracy | 65 | 65 | 75 |
Sensitivity | 70 | 70 | 50 |
Specificity | 60 | 60 | 80 |
PPV | 75 | 75 | 83 |
NPV | 25 | 25 | 64 |
The performance of FFRGPR was not very different from that of FFRCFD. However, the accuracy of DECSVM was slightly higher than that of FFRGPR with higher specificity but lower sensitivity. Furthermore, both the PPV and NPV of DECSVM were higher than those of FFRGPR.
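The definitions behind these performance measures can be sketched as follows, using the FFREXP, DECEXP, FFRGPR, and DECSVM columns of the 20 performance-test cases. Two conventions are assumptions inferred from the table rows rather than stated explicitly: DEC = 1 is treated as the positive class, and an FFR value is converted to DEC = 1 when it is at or below the 0.8 borderline. The sketch illustrates the definitions only and is not claimed to reproduce every entry of the table above.

```python
# (FFR_EXP, DEC_EXP, FFR_GPR, DEC_SVM) for cases 1-20 of the performance test.
CASES = [
    (0.38, 1, 0.723, 1), (0.53, 1, 0.622, 1), (0.71, 1, 0.773, 1),
    (0.78, 1, 0.701, 1), (0.79, 1, 0.698, 1), (0.85, 0, 0.838, 0),
    (0.86, 0, 0.906, 0), (0.87, 0, 0.819, 0), (0.90, 0, 0.901, 0),
    (0.91, 0, 0.844, 0), (0.94, 0, 0.928, 0), (0.88, 0, 0.745, 0),
    (0.88, 0, 0.784, 0), (0.90, 0, 0.763, 0), (0.77, 1, 0.8140, 1),
    (0.79, 1, 0.7942, 0), (0.78, 1, 0.7920, 0), (0.60, 1, 0.8537, 0),
    (0.77, 1, 0.8291, 0), (0.94, 0, 0.7512, 1),
]
FFR_THRESHOLD = 0.8  # decision borderline; the "<=" direction is an assumption


def diagnostic_metrics(truth, predicted):
    """Accuracy, sensitivity, specificity, PPV, and NPV with DEC = 1 as positive."""
    tp = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(truth, predicted) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, predicted) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(truth),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


dec_exp = [case[1] for case in CASES]
dec_gpr = [1 if case[2] <= FFR_THRESHOLD else 0 for case in CASES]
dec_svm = [case[3] for case in CASES]

gpr_metrics = diagnostic_metrics(dec_exp, dec_gpr)
svm_metrics = diagnostic_metrics(dec_exp, dec_svm)
```

Under these assumed conventions the accuracy evaluates to 65% for FFRGPR and 75% for DECSVM.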
Fig. 4 shows the error percentage with the data index sorted by FFREXP. The error was calculated from the difference between FFREXP and FFRGPR. The average error percentage was 27.46% when FFREXP < 0.75 (4 cases), 3.80% when 0.75 ≤ FFREXP < 0.85 (6 cases), and 12.70% when FFREXP ≥ 0.85 (10 cases).
Fig. 4 Error percentage by data index aligned by FFREXP. The average error percentage was 27.46% when FFREXP < 0.75, 3.80% when 0.75 ≤ FFREXP < 0.85, and 12.70% when FFREXP ≥ 0.85. |
Because the decision borderline, where the DECEXP value changes, was at FFREXP = 0.8, the most important region was 0.75 ≤ FFREXP < 0.85. Note that the error percentage was the lowest in this region. Fig. 5 shows the relative weight factor of each feature when the SVM model was trained. A higher weight factor indicates that DECSVM is more affected by that feature.
Fig. 7 shows the pressure, vorticity, and WSS of the 20 cases separated into the four categories. Moreover, it shows the contours of pressure, vorticity, and WSS for sample cases from each category. In Fig. 7, it is important to note the range of values and not only the average value. If the range of a feature in category 2 or 3 is similar to that in categories 1 and 4, the feature does not contribute to the mismatch. However, if this range is smaller in category 2 or 3, the feature might be related to the mismatch and additional investigation is required. Fig. 8 shows that FFREXP, vorticity, and OSI had a smaller range in categories 2 and 3 than in category 1.
There are still a few limitations that should be further studied to achieve a better model. The most critical limitation is that the quality and quantity of biometric features might not be enough. For example, while age or calcification is used to represent vessel stiffness, the exact vessel stiffness cannot be obtained unless it is measured directly. However, it is almost impossible to measure every FFR-related feature from every patient. Even the synthetic models used for amplifying the cases are not helpful for solving this problem because the features estimable by synthetic models are limited to CFD.
Also, the category 3 cases in the performance test should be analyzed further and eliminated. Category 3 means that DECSVM was wrong even though FFRGPR was estimated correctly. These errors are probably caused by overfitting of the SVM model, but the cases used for the performance test are not yet sufficient to fully analyze and correct them. In future work, reducing the number of category 3 cases will be the main objective for improving the algorithm.
Regardless of these limitations, it was confirmed that the accuracy of the ML algorithm can surpass that of CFD. The ultimate goal of ML-based FFR should be not only a reduced calculation time but also an accuracy and practicality high enough to replace FFREXP.
This journal is © The Royal Society of Chemistry 2020 |