Jinghan Fan ab, Xiao Wang ab, Yile Yu ab, Yuze Li *c and Zongxiu Nie *ab
a Beijing National Laboratory for Molecular Sciences, Key Laboratory of Analytical Chemistry for Living Biosystems, Institute of Chemistry, Chinese Academy of Sciences, Beijing 100190, China. E-mail: znie@iccas.ac.cn; Fax: +86-10-62652123; Tel: +86-10-62652123
b University of Chinese Academy of Sciences, Beijing 100049, China
c State Key Laboratory of High-efficiency Utilization of Coal and Green Chemical Engineering, College of Chemistry and Chemical Engineering, Ningxia University, Yinchuan 750021, China. E-mail: liyuze@nxu.edu.cn
First published on 23rd November 2022
Hepatocellular carcinoma (HCC) is one of the most common malignant tumors and has a high mortality rate. The diagnosis of HCC is currently based on alpha-fetoprotein detection, imaging examinations, and liver biopsy, which are expensive or invasive. Here, we developed a cost-effective, time-saving, and painless method for the screening of HCC via machine learning based on atmospheric pressure glow discharge mass spectrometry (APGD-MS). Ninety urine samples from HCC patients and healthy control (HC) participants were analyzed. The relative quantification data were used to train machine learning models. A neural network was chosen as the best classifier, with a classification accuracy of 94%. In addition, the levels of eleven urinary carbonyl metabolites, including glycolic acid, pyroglutamic acid, and acetic acid, were found to differ significantly between HCC and HC. Possible reasons for this regulation are tentatively proposed. This method realizes the screening of HCC via potential urinary metabolic biomarkers based on APGD-MS, offering a promising route to point-of-care diagnosis of HCC in a patient-friendly manner.
Pathological examination is the gold standard of HCC diagnosis. Through the biopsy of liver tissue, HCC can be diagnosed precisely.3 However, the biopsy procedure carries inherent risks such as seeding and bleeding,4 so patients first undergo serum alpha-fetoprotein (AFP) determination or imaging examination to screen for the possibility of HCC before biopsy.5 Nevertheless, AFP detection has a high false positive rate,6,7 while imaging methods are limited by interobserver variability, side effects of contrast agents, and high cost. Therefore, a cost-effective, time-saving, and patient-friendly method for the screening of HCC is urgently needed.
Urine is the final product of the metabolic system and can be acquired readily.8 It has the distinctive advantages of a simple matrix, high accessibility, basic sterility, and suitability for long-term storage in large volumes.9 In light of these advantages, urine is increasingly used as the sample in metabolomic research.10 By studying the levels of small metabolites (such as carboxylic acids, aldehydes, and ketones) in the urine of HCC patients and HCs, samples can be classified as HCC or HC by machine learning methods.11,12
Mass spectrometry (MS) is a powerful analytical method because it provides molecular weight and structural information. Chromatography-coupled MS is preferred for analyzing small metabolites in body fluids because of its high separation efficiency and strong quantitative ability. However, it has several drawbacks that cannot be ignored: sophisticated pretreatments are needed before analysis, operation is complicated, and the chromatographic step makes the analysis relatively long, typically longer than 1 hour.13–15
Carboxyl compounds represent an important class of metabolites in urine. About 36% of metabolites in urine contain one or more carboxyl groups. The detection of carboxylic compounds in urine, however, imposes strict requirements on measurement performance.16 Firstly, carboxylic acids are weak acids with low ionization efficiency, leading to weak signals in the mass spectra. Secondly, small carboxylic compounds cannot produce characteristic product ions in collision-induced dissociation (CID), which makes identification difficult. Thirdly, the mass of urinary carboxylic compounds is generally low, yet some mass spectrometers discriminate against small molecules. Hence, derivatization reagents are introduced to improve the ionization efficiency and sensitivity of carboxylic compounds. According to Zhang's study, 32 reagents can be used to derivatize carboxyl groups, of which amines are the most common.16 Although these derivatization reagents greatly improve detection, they also bring more sophisticated pretreatments and longer analysis times.17–20
Among all the derivatization reagents, N,N-dimethylethylenediamine (DMED) is one of the most widely used ones in carboxyl-containing metabolite derivatization.21,22 Herein, based on APGD-MS, using DMED, we realized the rapid separation and analysis of carbonyl (mainly carboxylic, also aldehydic and ketonic) compounds in urine. Through machine learning models, we successfully classified HCC and HC cohorts with high precision and sensitivity and obtained 11 potential urinary biomarkers of HCC. Fig. 1 shows the workflow of this cost-effective, time-saving, and pain-free approach to HCC screening.
All experiments were performed in accordance with the guidelines of the Hospital Scientific Research Ethics Committee of the First Affiliated Hospital of Gannan Medical University, and approved by the ethics committee at the First Affiliated Hospital of Gannan Medical University. Informed consent was obtained from human participants of this study.
To simplify this procedure, we employed a DSI platform to activate the derivatization. Since this platform could provide a high temperature of 230 °C, carboxyl (also aldehydic and ketonic) compounds could directly obtain energy and be activated without activating reagents or an alkaline environment. This simplification shortened the analysis time from 4–5 hours to 10 minutes and avoided the use of exogenous reagents.
To balance signal intensity, time, and cost, the added volume of DMED and the derivatization time were optimized. One hundred microliters of QC sample was mixed with 2.5, 1, 0.5, 0.25, 0.1, or 0.05 μL of DMED (made up to 2.5 μL with water) and derivatized for 1, 5, 10, 30, or 60 minutes at room temperature before APGD-MS analysis. To quantify the levels of urinary compounds, the mass peak intensity of creatinine, I(creatinine) + 2 × I(creatinine dimer), was chosen as the internal standard in the optimization experiments because of its superior stability at room temperature.26
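The internal-standard normalization described above can be sketched as follows. This is a minimal illustration, not the authors' processing code: the function names, the dictionary layout, and the peak labels are assumptions, and the intensities in the usage note are invented for demonstration only. The factor of 2 on the dimer peak simply follows the expression given in the text.

```python
def creatinine_reference(i_creatinine, i_dimer):
    """Internal-standard intensity, I(creatinine) + 2 * I(creatinine dimer)."""
    reference = i_creatinine + 2.0 * i_dimer
    if reference <= 0:
        raise ValueError("creatinine reference intensity must be positive")
    return reference


def relative_intensities(peaks, creatinine_key, dimer_key):
    """Divide every peak intensity in `peaks` by the creatinine reference.

    `peaks` maps a (hypothetical) peak label to its raw intensity.
    """
    reference = creatinine_reference(peaks[creatinine_key], peaks[dimer_key])
    return {name: intensity / reference for name, intensity in peaks.items()}


# Illustrative intensities only; these are not measured values.
example_peaks = {
    "creatinine": 800.0,
    "creatinine_dimer": 100.0,
    "metabolite_x": 250.0,
}
example_relative = relative_intensities(
    example_peaks, "creatinine", "creatinine_dimer"
)
```

With these illustrative numbers the reference is 800 + 2 × 100 = 1000, so `metabolite_x` has a relative intensity of 0.25.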
As shown in Fig. S5a,† the relative peak intensity of DMED decreased as the derivatization time was prolonged. This may be caused by the continued derivatization of high-level compounds or by the self-reaction of DMED (Fig. S5b†). As the added volume of DMED increased, the relative peak intensity of DMED increased, indicating that the excess DMED was not entirely consumed in the derivatization. For the 0.25 μL and 0.1 μL lines, the relative peak intensity of DMED remained flat after 10 min, and both were lower by half than the 0.5 μL line. This implies that 0.25 μL of DMED is essentially sufficient to derivatize the low-abundance compounds in 100 μL of urine.
The relative intensities of 41 mass peaks under different derivatization conditions were compared. A quarter of the metabolites, such as 5-aminosalicylic acid, were present at relatively low levels, so their derivatization could be completed with only 0.25 μL of DMED (Fig. S6a†). For metabolites at higher levels, such as cysteine and proline, derivatization could be completed with at most 1 μL of DMED (Fig. S6b and c†). Accordingly, we surmise that the metabolites were sufficiently derivatized when 1.0 μL of DMED was added. Regardless of the added volume of DMED, the relative intensities of most derivatives peaked within 5–10 min. Therefore, a derivatization time of 10 minutes was considered optimal.
In conclusion, the derivatization protocol was simplified as follows: 1.0 μL of DMED was added to 100 μL of urine sample. The mixture was vortexed for 30 s and incubated at room temperature for 10 minutes (Fig. S3b†).
To compare the quality of the mass spectra acquired by the different protocols, the average mass spectra of both methods in 3 minutes are shown in Fig. 2a. The simplified derivatization protocol was able to detect more peaks (Fig. 2b). This result was verified by the high-resolution MS analyses (Tables S2 and S3†).23 Interestingly, among the compounds identified with the general protocol, small molecules were few while large molecules were numerous. We assume this may be caused by the volatilization of light carbonyl compounds during the evaporation procedure.
Twenty percent of the data from each cohort were randomly selected as the test set, and the remaining 80% formed the training set. Five supervised classifiers, RF (Random Forest), LR (Logistic Regression), NN (Neural Network), SVM (Support Vector Machine), and GB (Gradient Boosting), were implemented to classify the data sets of the two cohorts. Evaluation results were used to measure the classification performance.
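The per-cohort 80/20 split described above can be sketched as below. This is a hedged illustration using only the Python standard library; the function name, the fixed seed, and the cohort sizes in the usage note are assumptions, since the text does not state how the random selection was implemented or how many samples each cohort contained.

```python
import random


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_indices, test_indices).

    The test set takes `test_fraction` of the samples from each cohort
    separately, so both cohorts keep the same train/test proportion.
    """
    rng = random.Random(seed)  # fixed seed (an assumption) for reproducibility

    # Group sample indices by cohort label.
    by_cohort = {}
    for index, label in enumerate(labels):
        by_cohort.setdefault(label, []).append(index)

    train, test = [], []
    for label in sorted(by_cohort):
        indices = by_cohort[label][:]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


# Hypothetical cohort sizes, chosen only to make the example easy to follow.
example_labels = ["HCC"] * 10 + ["HC"] * 10
train_idx, test_idx = stratified_split(example_labels)
```

For the hypothetical 10 + 10 samples, two samples per cohort go to the test set and the remaining sixteen form the training set. The classifiers themselves would then be fitted on the training indices only.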
As shown in Fig. 3a and b, NN gave the best result, with an AUC score of 0.990 and a classification accuracy of 94.4% for the training cohort, and was therefore chosen as the classifier for the screening of HCC. The evaluation results of NN for the test set are shown in Fig. 3c, and the prediction accuracy was 92.0%. This indicates that, combined with machine learning, the analysis of urinary carbonyl metabolites acquired by APGD-MS could be a reliable method for HCC screening.
Fig. 3 Screening results of machine learning models. (a) Evaluation results of 5 different models. (b) ROC curves of 5 different models. (c) Confusion matrix of test set.
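The two metrics quoted above, classification accuracy and AUC, can be computed as in the following sketch. It is not the authors' evaluation code: the function names and the "HCC" positive label are assumptions, and the scores in the usage note are invented. The AUC is calculated here through its rank interpretation, the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true, scores, positive="HCC"):
    """Area under the ROC curve from pairwise score comparisons.

    Each (positive, negative) pair counts 1 if the positive sample has the
    higher score, 0.5 for a tie, and 0 otherwise.
    """
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute AUC")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented labels and scores, for illustration only.
example_true = ["HCC", "HCC", "HC", "HC"]
example_scores = [0.9, 0.4, 0.6, 0.1]
example_pred = ["HCC" if s >= 0.5 else "HC" for s in example_scores]
example_auc = roc_auc(example_true, example_scores)
example_acc = accuracy(example_true, example_pred)
```

In this toy case three of the four positive/negative pairs are ranked correctly, giving an AUC of 0.75, while thresholding the scores at 0.5 gives an accuracy of 0.5. The example shows that the two metrics measure different things and need not agree.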
A significant difference was defined as fold change (FC) > 2 or FC < 0.5, p-value < 0.05, and variable importance in projection (VIP) value > 1. Under these criteria, 11 compounds can be considered differential metabolites of HCC: acetic acid, creatine, propionic acid, glycolic acid, cyanoacetic acid, nicotinic acid, heptenoic acid, L-pyroglutamic acid, L-ornithine, perillic acid, and N-acetyltaurine. The violin plots of these 11 metabolites are shown in Fig. 4. The levels of these 11 metabolites were generally higher in the HCC cohort, and the most significantly enriched among them were glycolic acid, L-pyroglutamic acid, acetic acid, etc.
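The three-part selection rule above can be written as a short filter. The sketch below is an assumption-laden illustration: it takes FC as the ratio of the HCC mean to the HC mean (the direction is not stated in the text), and it treats the p-value and VIP value as precomputed inputs because the statistical test and the projection model are not specified here. All names and numbers are hypothetical.

```python
from statistics import mean


def fold_change(hcc_values, hc_values):
    """Ratio of mean relative intensity, HCC over HC (assumed direction)."""
    return mean(hcc_values) / mean(hc_values)


def is_differential(fc, p_value, vip):
    """Apply the stated criteria: (FC > 2 or FC < 0.5), p < 0.05, VIP > 1."""
    return (fc > 2.0 or fc < 0.5) and p_value < 0.05 and vip > 1.0


def select_differential(metabolites):
    """Return names of metabolites that meet all three criteria.

    `metabolites` maps a name to a dict with keys "hcc", "hc" (lists of
    relative intensities), "p" (precomputed p-value), and "vip"
    (precomputed VIP value).
    """
    selected = []
    for name, data in metabolites.items():
        fc = fold_change(data["hcc"], data["hc"])
        if is_differential(fc, data["p"], data["vip"]):
            selected.append(name)
    return selected


# Invented values for two placeholder metabolites.
example = {
    "metabolite_a": {"hcc": [4.0, 6.0], "hc": [2.0, 2.0], "p": 0.01, "vip": 1.3},
    "metabolite_b": {"hcc": [3.0, 3.0], "hc": [2.0, 2.0], "p": 0.01, "vip": 1.3},
}
example_selected = select_differential(example)
```

Here `metabolite_a` has FC = 2.5 and passes, whereas `metabolite_b` has FC = 1.5 and is rejected despite its low p-value and high VIP, which shows that all three conditions must hold together.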
To explore the latent relationships between urinary carbonyl compounds, a Pearson correlation heatmap and a pathway map were obtained based on the relative intensities of 41 peaks (Fig. 5). Acetaldehyde, cysteine, L-ornithine, etc. were negatively correlated with creatinine, proline, etc. Acetic acid, creatine, and glycolic acid showed strong positive correlations with each other. This suggests that close connections may exist between the metabolic pathways of these compounds, yet the specific mechanism needs further investigation. In addition, possible reasons for some of the altered metabolites are proposed as follows.
Fig. 5 (a) Correlation heatmap of 41 metabolites. (b) Pathway map of the differential metabolites between HCC and HC.
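The Pearson correlation underlying a heatmap of this kind can be sketched as below. This is a standard-library illustration rather than the authors' plotting pipeline; the function names and the toy intensity series are assumptions. Each entry of the matrix is the correlation between the relative intensities of two peaks across samples.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / (spread_x * spread_y)


def correlation_matrix(columns):
    """Pairwise Pearson correlations for a dict of name -> intensity list."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Toy series across three samples; not measured data.
example_columns = {
    "peak_1": [1.0, 2.0, 3.0],
    "peak_2": [2.0, 4.0, 6.0],
    "peak_3": [3.0, 2.0, 1.0],
}
example_matrix = correlation_matrix(example_columns)
```

In the toy data `peak_1` and `peak_2` rise together and correlate at approximately +1, while `peak_1` and `peak_3` move in opposite directions and correlate at approximately -1, corresponding to the positive and negative blocks one would read off a heatmap.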
The liver is a vital site of acetate metabolism, but the diffuse degeneration and necrosis of liver tissue caused by HCC impair normal liver function. In cancer cells, energy supply by glycolysis increases while the tricarboxylic acid (TCA) cycle decreases, yet acetic acid is metabolized to carbon dioxide and water through the TCA cycle. Additionally, acetate is one of the main products of β-oxidation of fatty acids in the liver, and HCC enhances this process, leading to increased acetate levels.28 Therefore, acetate metabolism in hepatocytes would decrease while acetate production would increase, so that the excess acetate may be excreted through urine, leading to the upregulation of urinary acetic acid.
The liver is also the major organ for amino acid metabolism, and liver injury would result in its downregulation. According to the research of Leeda-Arporn, HCC causes the upregulation of glutamic acid levels in serum.29 Since metabolite changes in serum are basically paralleled by those in urine, and pyroglutamic acid is the product of the dehydration and cyclization of glutamic acid, the pyroglutamic acid level in urine would increase.
A metabolic cycle of oxalic acid exists in human hepatocytes: mitochondria metabolize glycine to oxalate, glycolate, and glyoxylate and excrete them into the cytoplasm.30 In normal cells, peroxisomes take part in this cycle, taking up glycolate and glyoxylate and transforming them into oxalate for further use. In HCC cells, however, peroxisomes are not available for glycolic acid metabolism. This abnormality causes glycolic acid to accumulate and be excreted into urine, resulting in the upregulation of the urinary glycolic acid level.
Creatine kinase is the central controller of cellular energy homeostasis, and its mitochondrial isoenzyme plays a pivotal role in cellular energy metabolism. Overexpression of the mitochondrial isoenzyme of creatine kinase has been found in HCC tissue, which makes it a biomarker of HCC.31 This overexpression may also accelerate the interconversion of creatine and creatine phosphate, raise their levels, and ultimately increase urinary creatine levels.
Using these 11 potential HCC biomarkers as features, we repeated the machine learning classification (Fig. S11a and b†). NN remained the optimal model for HCC screening, with an AUC score of 0.936 and a classification accuracy of 84.4% for the training cohort. The evaluation results for the test set are shown in Fig. S11c,† and the accuracy was 90.0%. Although the classification performance and prediction accuracy differ slightly from those obtained with all 41 mass peaks, they are still acceptable. This indicates that these 11 metabolites can be considered potential urinary biomarkers of HCC, enabling the rapid screening of HCC.
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d2an01756c
This journal is © The Royal Society of Chemistry 2023