João Marcos G. Barbosa *a, Engy Shokry a, Lurian Caetano David a, Naiara Z. Pereira a, Adriana R. da Silva b, Vilma F. de Oliveira b, Maria Clorinda S. Fioravanti b, Paulo H. Jorge da Cunha b, Anselmo E. de Oliveira c and Nelson Roberto Antoniosi Filho *a
aLaboratório de Métodos de Extração e Separação, Instituto de Química, Universidade Federal de Goiás (UFG), Campus II – Samambaia, 74690-900, Goiânia, GO, Brazil. E-mail: joaomarcosquim.ufg@outlook.com; nelsonroberto@ufg.br
bHospital Veterinário – Escola de Veterinária e Zootecnia da UFG, Rodovia Goiânia – Nova Veneza, km 8 Campus II – Samambaia, 74690-900, Goiânia, GO, Brazil
cLaboratório de Química Teórica e Computacional, Instituto de Química, Universidade Federal de Goiás (UFG), Campus II – Samambaia, 74690-900, Goiânia, GO, Brazil
First published on 8th September 2023
Cancer is one of the deadliest diseases in humans and dogs. Nevertheless, most tumor types spread faster in canines, and early cancer detection methods are necessary to enhance animal survival. Here, cerumen (earwax) was tested as a source of potential biomarkers for cancer evaluation in dogs. Earwax samples were collected from tumor-bearing and clinically healthy dogs, followed by Headspace/Gas Chromatography-Mass Spectrometry (HS/GC-MS) analyses and a multivariate statistical workflow. An evolutionary-based multivariate algorithm selected 18 out of 128 volatile metabolites as a potential cancer biomarker panel in dogs. The candidate biomarkers showed a full discrimination pattern between tumor-bearing dogs and cancer-free canines with high accuracy in the test dataset: an accuracy of 95.0% (75.1–99.9), and sensitivity and specificity of 100.0% and 92.9%, respectively. In summary, this work raises a new perspective on cancer diagnosis in dogs that is painless and non-invasive, facilitating sample collection and periodic application in a veterinary routine.
Most traditional diagnostic methods are invasive and require cytology or histology testing performed directly on the specific suspected type of tumor. However, these approaches may hinder an accurate identification due to the lack of characteristic clinical signs at the earliest stages of cancer growth. Furthermore, despite the current increase in clinical trials for cancer detection in dogs using advanced imaging techniques (computed tomography, magnetic resonance, and positron emission tomography),4 these techniques often present drawbacks: the risk of excessive exposure of the animal to radiation, the effects of anesthetics and their associated procedures, and the cost-prohibitive nature of the exams.
New omics platform tests have been explored for veterinary purposes to overcome the drawbacks of conventional diagnostic tools. A recent clinical validation assay applying next-generation sequencing of blood-derived DNA (liquid biopsy) for multi-cancer detection was used in over 1000 dogs and presented a detection rate higher than 85% for the most aggressive canine cancer types.5 Elevated circulating cell-free DNA (cfDNA) levels in dogs with cancer were found to be a potential marker for tumor diagnosis and disease prognosis.6 Moreover, a previous study demonstrated the usefulness of evaluating nucleosome concentrations as a tool in veterinary oncology.7
In addition to genome-based testing, new bioanalytical methods have been developed by screening volatile organic metabolites (VOMs) derived from biological matrices.8,9 VOMs are promising candidates for metabolome-based diagnostics as they can be measured with safe, non-invasive, and specific tests for the early detection of different cancer types.10–12 One primary origin of the volatile biomarkers for cancer may be linked to the reactive oxygen species (ROS) associated with cancer growth and progression.13,14 These oxidative species promote DNA damage in cells, endoplasmic reticulum stress, and the oxidation of lipids and proteins, culminating in small metabolites being released into body fluids.15 Although the potential of VOMs for cancer detection in humans has been explored, there is a lack of volatile compound-based methods for cancer detection in dogs. Therefore, to facilitate the exploration of cancer-related VOMs for diagnostic purposes in dogs, it is critical to assess their chemical attributes and diversity.
Cerumen, or earwax, is a biomatrix that seems to fit these criteria. It refers to secretions of the sebaceous and sweat glands containing polar and non-polar substances, mainly metabolic lipid-derived components.16,17 Recently, cerumen VOMs have been successfully applied in diabetes diagnosis,18,19 forensic applications,20–24 toxicological monitoring,25–29 identification of a rare otolaryngological disorder (Ménière's disease),30 and human cancer diagnosis.31 The chemical fingerprint of cerumen VOMs used for disease detection is known as the cerumenogram.
In this sense, this work proposes the use of cerumen as a source for valuable biomarker prospection in cancer identification in dogs. It is essential to emphasize that, as far as the authors are aware, this study spotlights the first application of the canine cerumen volatilome for cancer detection purposes, using a non-invasive biomatrix and a qualitative method performed without sample preparation steps by Headspace/Gas Chromatography-Mass Spectrometry (HS/GC-MS) analysis and multivariate statistical approaches.
After clinical confirmation, one hundred earwax samples were obtained from 50 cancer-free dogs (control group) and 50 tumor-bearing dogs (case group). The case group was heterogeneous and included 31 samples from female dogs with breast cancer (BC), 7 samples (six male and one female) from dogs with skin and soft tissue cancer (STC), 2 samples from two male dogs with colorectal cancer (CC), 6 samples from six male dogs with testicular and prostate cancer (TPC), and 4 samples from four male dogs with transmissible venereal tumor (TVT). Although TVT is clonally transmitted rather than caused by mutation, and is therefore generally considered a different tumor outcome from the other cancer types included, we kept these samples to observe the similarities of their earwax chemical profile to those of the other cancer types. Also, among the case group, 23 samples were collected from animals undergoing chemotherapy. All the chemotherapeutics and other medications used during the cancer treatment were therefore analyzed by HS/GC-MS to evaluate and exclude the presence of any potential xenobiotics and their derivatives in earwax profiles, and their effect on the cerumen chemical profile was tested before any biomarker selection steps. Detailed demographic characteristics of the canine groups and the specific diagnosis for the cancer group are available in Table S1 in the ESI.†
The molecular features detected were corrected by subtracting the QC blank runs. Those corresponding to a chromatographic peak with an area/height (A/H) ratio greater than or equal to three were flagged as putative metabolites. In total, 128 peaks corresponding to VOMs were selected. To perform statistical analysis, the 128 VOMs were transformed into a binary dataset, where “1” indicates the presence of the compound (peak area > 0 and A/H ≥ 3) and “0” its absence (peak area = 0 or A/H < 3). Then, to evaluate which variables/VOMs might discriminate between the control and cancer groups, a binary data matrix of 100 samples (50 cancer-free dogs + 50 tumor-bearing dogs) by 128 VOMs was built.
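The presence/absence rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' pipeline: the function names, the toy peak values, and the guard against a zero peak height are all assumptions added here.

```python
def binarize_peak(area, height, min_ratio=3.0):
    """Return 1 if a peak counts as present (area > 0 and A/H >= min_ratio), else 0."""
    # Assumption: a non-positive height is treated as "absent" to avoid dividing by zero.
    if area <= 0 or height <= 0:
        return 0
    return 1 if area / height >= min_ratio else 0


def binarize_table(peaks):
    """Convert rows of (area, height) pairs (one row per sample) into a 0/1 matrix."""
    return [[binarize_peak(area, height) for area, height in sample] for sample in peaks]


# Toy data: 2 samples x 3 VOMs, values invented for illustration only.
toy_peaks = [
    [(1200.0, 300.0), (0.0, 0.0), (500.0, 250.0)],  # A/H = 4.0, absent, 2.0
    [(900.0, 300.0), (450.0, 100.0), (0.0, 0.0)],   # A/H = 3.0, 4.5, absent
]
binary = binarize_table(toy_peaks)
print(binary)
```

In the study the corresponding matrix would have 100 rows (samples) and 128 columns (VOMs).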
Multivariate statistical analyses employed for variable selection and visualization were carried out using R version 4.1.2 and the online platform Metaboanalyst 5.0 (https://metaboanalyst.ca/).33 A biological evolutionary-based approach using a genetic algorithm with partial least squares (GA-PLS)34 was applied to optimize the discriminant VOM selection in the canine dataset. The parameters used for GA-PLS analysis were: a population size of 100, a window width of 1, a maximum number of variables in each population of 100, a convergence probability of 50%, a mutation probability of 0.5%, and a maximum of 35 generations with contiguous cross-validation. Multiple GA-PLS models were run to select the best discriminant variables/VOM panel and avoid overfitting. Each time, the performance of the chromosome (PLS of the chosen variables/VOMs) was evaluated using k-fold cross-validation – k ∈ {10,10} – as a random resampling technique. Then, the optimal tuning parameters of each panel of discriminant variables/VOMs were assessed under 100 different resamples of the training set (80% of the samples: n = 80), using the receiver operating characteristic (ROC) as an efficacy metric for the model. Next, the test/validation set (the remaining 20% of the samples: n = 20) was used to evaluate the fitted model. The best panel of discriminant VOMs was chosen based on the diagnostic measurement results of the test set: accuracy, sensitivity, specificity, control and cancer predictive ability, and kappa value.
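The diagnostic measurements listed above all derive from a two-by-two confusion matrix. The following Python sketch shows the standard formulas, including Cohen's kappa. The counts are hypothetical (a balanced 20-sample test set with one error) and are not taken from Table S3; which class is treated as "positive" is likewise an assumption of the sketch.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Compute common diagnostic metrics from confusion-matrix counts."""
    n = tp + fn + tn + fp
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    ppv = tp / (tp + fp)           # positive predictive value
    npv = tn / (tn + fn)           # negative predictive value
    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = ((tp + fn) / n) * ((tp + fp) / n) + ((tn + fp) / n) * ((tn + fn) / n)
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "kappa": kappa,
    }


# Hypothetical counts: 10 positives all detected, 9 of 10 negatives correct.
metrics = diagnostic_metrics(tp=10, fn=0, tn=9, fp=1)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts the accuracy is 0.95 and kappa is 0.90, which shows how a single misclassification in a balanced 20-sample set moves kappa more than accuracy.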
Hierarchical cluster analysis (HCA) was run applying Hamming distances and the Ward agglomeration method to visualize the distance measures of the best chromosome of the GA-PLS. Principal coordinate analysis (PCoA) was run using Hamming distance to visualize the samples across the multidimensional space in two principal coordinates (PCO1 and PCO2). Venn diagrams were built using the web-based “InteractiVenn (interactivenn.net)” tool.35 The following R packages were used: e1071 (v1.7-9),36 dendextend (v1.15.2),37 plsVarSel (v0.9.7),38 ade4 (1.7-18),39 caret (v6.0-92),40 and ggplot2 (v3.3.5).41
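For binary presence/absence profiles, the Hamming distance is simply the number of VOMs on which two samples disagree. The sketch below builds the pairwise distance matrix that a clustering or PCoA routine would take as input; the sample names and profiles are invented for illustration, and the Ward agglomeration step itself is not reproduced here.

```python
def hamming(a, b):
    """Count the positions at which two equal-length binary profiles differ."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    return sum(x != y for x, y in zip(a, b))


def distance_matrix(profiles):
    """Return the symmetric matrix of pairwise Hamming distances."""
    return [[hamming(p, q) for q in profiles] for p in profiles]


# Hypothetical 5-VOM profiles (1 = present, 0 = absent).
samples = {
    "control_1": [1, 1, 0, 0, 0],
    "control_2": [1, 1, 0, 0, 1],
    "cancer_1": [0, 0, 1, 1, 1],
    "cancer_2": [0, 0, 1, 1, 0],
}
names = list(samples)
dist = distance_matrix([samples[name] for name in names])
for name, row in zip(names, dist):
    print(name, row)
```

In this toy case the within-group distances are 1 and the between-group distances are 4 or 5, the kind of separation a dendrogram built on these distances would display.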
In this study, the cerumen analysis of dogs led to the annotation of 128 VOMs, covering a broad spectrum of chemical diversity from polar to non-polar metabolites. The cerumen VOMs are described in Table S2 (ESI†) with their respective elution order, IUPAC name, target peak (m/z), the level of identification, canonical simplified molecular input line entry system (SMILES), CAS number, and the occurrence (%) in each group of samples. Typical total ion chromatogram (TIC) fingerprints of dog cerumen analysis are shown in Fig. S1 (ESI†). In sum, the deconvolution of the TICs led to the identification of 22 alcohols, 19 aldehydes, 18 ester/ethers, 17 ketones, 14 carboxylic acids, 13 amine/amide derivatives, 12 hydrocarbons, 7 furan and lactones, 4 pyran metabolites, and 2 pyrrole compounds.
The abundance of organic families in cerumen resembles that detected in the skin metabolome,8 which may be associated with the sebum material present in both biomatrices.17,31,49 Sebum is an oily substance produced by the sebaceous glands; when it is broken down upon exposure to ROS, it gives rise to many volatile organic compounds from different organic classes, such as aldehydes, ketones, and hydrocarbons,8 also identified herein with a high frequency. Nonetheless, although both skin and cerumen offer chemical advantages regarding the vast range of polarity, it might be preferable to work with cerumen rather than skin secretions because it is less susceptible to external contamination, such as cosmetic products, and less exposed to ultraviolet (UV) radiation and air pollution. Moreover, the earwax volatile profile, combined with canine volatilome information from other biofluids, may be a valuable resource to elaborate a compendium of VOMs detected in the canine organism, as already performed for humans.50
To look for possible demographic bias in the dog cerumenomic data, cluster analyses were run for factors such as age and sex before applying any variable selection procedure using the complete set of 128 VOMs. As done for humans, a binary data approach was used to reduce the noise of demographic factor impact on the earwax chemical profile.31 Dendrograms in Fig. S2 and S3 (ESI†) present no discrimination pattern for sex and age, respectively.
On the other hand, Fig. S4 (ESI†) indicates that cancer samples present, in general, a different VOM pattern (right-hand branch of the dendrogram) when compared to the control samples (left-hand branch). The clusters in Fig. S4 (ESI†) indicate that this discrimination can be improved using a variable selection procedure. Furthermore, as shown in Fig. S5 (ESI†), no pattern of discrimination between dogs under chemotherapy and cancer dogs without therapy was found.
Ten GA-PLS models were built to select the best panel of candidate biomarkers. In sum, 65 out of 128 VOMs distributed across ten GA chromosomes were used to build panels of potential cancer biomarkers. In addition, we created two more chromosomes with a cut-off of 0.5 and 0.8 of frequency of variable selection. In summary, the chromosomes from I to X held the variables randomly selected by GA, whilst XI and XII contained the most frequently chosen variables during the ten repeated times (5 out of 10 times and 8 out of 10 times, respectively), as shown in Fig. S6 (ESI†). Table S3 (ESI†) presents a summary of the performance of each candidate model using training and test sets. Fig. S7 (ESI†) presents ROC plots for the optimal tuning parameter results (measured using repeated k-fold cross-validation) for the 12 evaluated chromosomes.
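The two consensus chromosomes (XI and XII) are described as keeping the variables chosen most often across the ten GA runs. A minimal Python sketch of that frequency cut-off follows. The run contents are toy values, and reading the cut-off as "at least" the stated fraction of runs is an assumption of this sketch, since the text could also be read as "exactly".

```python
from collections import Counter


def consensus_variables(runs, cutoff):
    """Return variables selected in at least `cutoff` (a fraction) of the runs."""
    # set(run) ensures a variable is counted once per run.
    counts = Counter(var for run in runs for var in set(run))
    n_runs = len(runs)
    return sorted(var for var, count in counts.items() if count / n_runs >= cutoff)


# Toy selections from ten hypothetical GA runs (variable IDs are made up).
toy_runs = [
    [1, 2, 3], [1, 2], [1, 3], [1, 2], [1, 4],
    [1, 2], [1, 2], [1, 5], [1, 2], [2, 3],
]
panel_50 = consensus_variables(toy_runs, cutoff=0.5)
panel_80 = consensus_variables(toy_runs, cutoff=0.8)
print(panel_50, panel_80)
```

A stricter cut-off yields a smaller, more stable panel, which is the trade-off the 0.5 and 0.8 thresholds explore.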
As shown in Table S3 (ESI†), chromosomes I, V, and VII presented similar performances in the test dataset, with an accuracy of 95% and a kappa value of 90%. Two VOMs are shared among the three chromosomes (Fig. S8, ESI†). Even though all three chromosomes exhibited similar performances, chromosome I was selected as the best panel of candidate biomarkers for the following reasons: (i) chromosome I presents the most straightforward model, using only 1 PLS component compared to the 3 used by chromosomes V and VII, which makes it a less complex model and less prone to overfitting; (ii) contrary to chromosomes V and VII, the performance of chromosome I in the training set reached the highest values for all metrics used to measure the effectiveness of a diagnostic test (ROC = 100.0%, accuracy = 100.0%, sensitivity and specificity = 100.0%, kappa value = 100.0%, healthy and cancer predictive value = 100.0%); and (iii) the candidate biomarkers selected by chromosomes V and VII did not show a full discrimination pattern between samples from the cancer and control groups, as shown in Fig. S9 (ESI†). Thus, the 18 VOMs in chromosome I were selected as the best subset of potential cancer biomarkers in dogs. Fig. S10 (ESI†) shows the panel of 18 volatile biomarkers, including their chemical structures and occurrences in cancer and control groups.
Among the 18 VOMs, 3-methylbutanal (VOM 11, Table S2, ESI†), 2-furanmethanol (VOM 23), octanal (VOM 40), 3-decenoic acid (VOM 58), 1,1-dibutoxyhexadecane (VOM 64), 1-dodecanol (VOM 75), hexadecane (VOM 79), octadecane (VOM 88), n-nonadecanoic acid (VOM 108), methyl palmitate (VOM 115), methyl stearate (VOM 121), and heneicosane (VOM 126) present a higher frequency of occurrence in cancer samples. On the other hand, 2-hydroperoxypentane (VOM 7), 2-methylfuran (VOM 9), 2-dodecanone (VOM 78), 2-tetradecanone (VOM 93), 5-dodecenyl acetate (VOM 94), and butyl stearate (VOM 127) appear more frequently in the control group.
Fig. 1 presents the circular dendrogram run using the 18 VOMs selected as potential cancer biomarkers in the earwax of dogs. This dendrogram shows a clear discrimination pattern between control and cancer samples. On the other hand, over-classification bias driven by the sex, age, or cancer therapy of the dogs was not found since no pattern of discrimination can be noticed in Fig. S11–S15 (ESI†). Moreover, although it is not within the scope of the paper, we monitored if any clusters could be observed due to the cancer type in the cerumenogram model. However, as shown in Fig. S16 (ESI†), no apparent trend was noticed.
Thus, the 18 VOMs selected by GA-PLS emerge here as promising potential cancer biomarkers in dogs, with the following diagnostic metrics in the test set: accuracy of 95.0% (95% CI: 75.1–99.9), sensitivity of 100.0%, specificity of 92.9%, kappa value of 90.0%, healthy predictive value of 90.9%, and cancer predictive value of 100.0%.
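An accuracy of 95.0% on a test set of n = 20 corresponds to 19 correct classifications, and the wide confidence interval reflects that small sample size. The text does not state which interval method was used; the sketch below assumes an exact (Clopper-Pearson) binomial interval, computed by bisection with the standard library only. Function names are illustrative.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_increasing(f, target, iterations=100):
    """Find p in [0, 1] with f(p) = target, for f increasing in p, by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) confidence interval for a binomial proportion."""
    # Lower bound: P(X >= k | p) = alpha / 2.
    lower = 0.0 if k == 0 else solve_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    # Upper bound: P(X <= k | p) = alpha / 2, i.e. P(X > k | p) = 1 - alpha / 2.
    upper = 1.0 if k == n else solve_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper


lower, upper = clopper_pearson(k=19, n=20)
print(f"{100 * lower:.1f}% to {100 * upper:.1f}%")
```

Under this assumption the interval for 19 of 20 agrees with the reported 75.1–99.9 range after rounding.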
As shown in the Heatmap (Fig. 2a), when observing the matrix correlation for the samples regarding the 18 potential cancer biomarkers, there is a high correlation (>0.5, Spearman rank correlation test) within the samples from the same group (control or cancer), contrasting with a poor correlation of the samples from different groups (control × cancer). Another pattern observed in the Heatmap is the high correlation of VOMs with the same trend of occurrence across the groups. For example, as observed in Fig. 2b, there is a high correlation of VOMs 126, 121, 40, and 79 (heneicosane, methyl stearate, octanal, and hexadecane, respectively), which are related to their higher presence in cancer samples compared to control ones (Fig. S10, ESI†). Similarly, VOMs 7 and 9 (2-hydroperoxypentane and 2-methylfuran, respectively) are highly correlated, appearing more frequently in the control group (Fig. S10, ESI†).
To test the discrimination power of the selected biomarkers, 18 dogs from the cancer group (n = 9 BC, n = 4 STC, n = 1 TPC, n = 1 CC, and n = 3 TVT) were chosen to provide earwax samples longitudinally during their routine visits to the veterinary hospital for cancer treatment, totaling 33 new earwax samples (Table S4, ESI†). After tracking the presence/absence of the 18 earwax VOMs, these new samples were added to the healthy/cancer VOM discrimination set without being labeled as Y (cancer) or N (cancer-free, control). A multivariate ROC (MultiROC) curve-based model function predicted the class of the 33 new samples. Table S4 (ESI†) shows that all samples were correctly classified in the Y (cancer) class. Fig. S17 (ESI†) shows the PCoA plot with the longitudinal cancer samples, evidencing their similarities with the cancer group.
Moreover, of the aldehydes selected as potential biomarkers for cancer in dogs, 3-methylbutanal (VOM 11) and octanal (VOM 40) have been shown to be side-products of lipid peroxidation.8 Cerumen analysis indicates a slightly higher frequency of the biomarker 3-methylbutanal in samples from dogs with cancer compared to cancer-free dogs (≈10%, Table S2, ESI†). Interestingly, a previous study showed the same pattern of occurrence for this metabolite in the human urinary signature of breast cancer patients compared to healthy patients.59 Nevertheless, of the two aldehydes selected as biomarkers, octanal is the one that has been previously described as a cancer biomarker in the blood plasma of dogs.48 In this work, octanal was only detected in earwax samples of dogs from the cancer group. Furthermore, in a previous study, this metabolite was selected as a crucial discriminant metabolite in the hair of canines to identify visceral leishmaniasis,60 which may indicate that this aldehyde is an important marker for metabolic disturbances in dogs.
In total, the evolutionary algorithm indicated six compounds putatively annotated from ester and organic ether classes as candidate markers: 2-hydroperoxypentane (VOM 7), 1,1-dibutoxyhexadecane (VOM 64), 5-dodecenyl acetate (VOM 94), methyl palmitate (VOM 115), methyl stearate (VOM 121), and butyl stearate (VOM 127). Compounds from these organic families are derived from lipid metabolism,61 which is a route that cancer cells use to compensate for the Warburg effect and produce the energy necessary for tumor cell proliferation.62 Among these metabolites, methyl palmitate and methyl stearate were identified as volatile markers of adipogenic differentiation in mesenchymal stromal cells.63 Furthermore, methyl stearate has been detected as a breast cancer biomarker in human blood serum64 and as an important compound in a panel of biomarkers identified in exhaled breath for discriminating between lung cancer patients and those with high-risk factors.65
Fatty acids, also metabolites widely associated with cancer lipid metabolism, play an essential role in the mutation of tumor cells to ensure their growth, proliferation, and survival.62 Two fatty acids were found in this work as potential cancer biomarkers: 3-decenoic acid (VOM 58) and n-nonadecanoic acid (VOM 108), both presenting higher occurrence in cancer samples (Fig. S10, ESI†). Previous studies employing a metabolomic approach for detecting cancer in dogs have revealed higher fatty acid levels in the blood of dogs with oral melanoma47 and lymphoma,46 indicating that compounds in this class may be useful markers for cancer identification in dogs. Also, a previous study has demonstrated that evaluating fatty acid levels in cerumen could be a useful clinical tool for rapidly and accurately detecting Ménière's disease.30
During tumor progression, the higher rate of fatty acid oxidation, associated with a negative protein and energy balance, results in cancer cachexia, which is characterized by the involuntary loss of the patient's lean body mass and the release of ketone bodies.66 Ketone bodies have been previously identified as a potential indicator of cancer in dogs.43,48 Here, GA-PLS selected the ketones 2-dodecanone (VOM 78) and 2-tetradecanone (VOM 93) as members of the panel of canine cancer biomarkers in cerumen. Remarkably, these two ketones were previously identified as earwax biomarkers for other metabolic disturbances. For example, increased levels of 2-dodecanone were found in type 1 diabetic patients,18 and higher concentrations of both ketones have been observed in mammals during pre- and post-parturition periods.16
The oxidative stress associated with the peroxidation of polyunsaturated fatty acids in cellular and subcellular membrane levels can indicate cancer initiation, and it is the primary mechanism of hydrocarbon release in the body of mammals.66 In this study, three long straight-chain saturated hydrocarbons were designated as probable cancer biomarkers in dogs: hexadecane (VOM 79), octadecane (VOM 88), and heneicosane (VOM 126). Among them, hexadecane has been described as part of the volatile signature for bladder cancer cell lines,67 lung cancer tissue,68 and in the breath samples of ovarian cancer patients where it appears at higher concentrations.69 Also, hexadecane and octadecane increase their levels in exhaled breath of gastric cancer patients.70 Heneicosane is a common metabolite detected in the human body biomatrices, such as skin, saliva, and cerumen.50,71 Here, it is described for the first time as having a higher occurrence in cancer dogs (78.0%) compared to control dogs (2.0%) (Fig. S10 and Table S2, ESI†), being the most prominent variable in the GA-PLS model (Table S3, ESI†).
Endogenous furan derivative formation in mammals may be associated with the natural dehydration of monosaccharides and the oxidation of fatty acids catalyzed by lipoxygenases.72 The two furan compounds detected as biomarkers, 2-methylfuran (VOM 9) and 2-furanmethanol (VOM 23), have already been detected when cancer is present. The metabolite 2-methylfuran has been described as part of the volatilomic footprint of human gastric cancer cell lines73 and in the urine of tumor-bearing mice.74 In addition, both metabolites, 2-methylfuran and 2-furanmethanol, were detected in the urinary profile of human breast cancer patients.59
As noted, lipid metabolism for energy purposes and the oxidative stress associated with the tumor/cancer cell microenvironment are the leading causes of the appearance of these candidate cancer biomarkers in the cerumen of dogs. Nevertheless, many hypotheses explaining the exact origin of the volatile biomarkers have been proposed so far, and further efforts exploring the volatile fingerprint of many cancer cell lines must be conducted to elucidate the cancer volatilome and eliminate the confounding effects associated with clinical samples.66 Some of these hypotheses are: (i) the volatile biomarkers arise due to pathway over-activation in cancer, e.g., glycolysis, fatty acid biosynthesis, mitochondrial β-oxidation of long-chain saturated fatty acids, etc.; (ii) the volatile metabolites are linked to the immune system rather than the tumor environment; (iii) many of the volatile biomarkers arise due to patient exposure to the environment (exposome); and (iv) cancer VOMs emerge in the organism as a consequence of cancer stem cells and their high levels of aldehyde dehydrogenase (ALDH).66 Although these hypotheses are yet to be tested, the origin of the potential biomarkers identified in this study suggests that the first hypothesis is the most likely explanation for the metabolic differences noted when cancer is present, which also emphasizes the potential utility of volatile biomarkers in cancer diagnosis.
In summary, this study sheds light on the perspective that the discriminating cancer-related VOMs in cerumen may be linked to lipid metabolism and oxidative stress, which has the potential to indicate mitochondrial dysfunction during cancer growth and progression. Studies involving other animals, as well as the footprint and fingerprint of a wide variety of cancer cell lines, may be helpful in the search for common biomarkers to elucidate and confirm the oncopathways, which can guide the development of new chemotherapeutic approaches and diagnostic kits focusing on a specific targeted group of molecules.
Footnote
† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d3mo00147d
This journal is © The Royal Society of Chemistry 2024