Metabolic alterations in dairy cattle with lameness revealed by untargeted metabolomics of dried milk spots using direct infusion-tandem mass spectrometry and the triangulation of multiple machine learning models

Wenshi He; Ana S. Cardoso; Robert M. Hyde; Martin J. Green; David J. Scurr; Rian L. Griffiths; Laura V. Randall; Dong-Hyun Kim

doi:10.1039/D2AN01520J


This Open Access Article is licensed under a Creative Commons Attribution 3.0 Unported Licence.

DOI: 10.1039/D2AN01520J (Paper) Analyst, 2022, 147, 5537-5545

Wenshi He ^a, Ana S. Cardoso ^b, Robert M. Hyde ^b, Martin J. Green ^b, David J. Scurr ^a, Rian L. Griffiths ^a, Laura V. Randall *^b and Dong-Hyun Kim *^a
^aCentre for Analytical Bioscience, Advanced Materials & Healthcare Technologies Division, School of Pharmacy, University of Nottingham, Nottingham NG7 2RD, UK. E-mail: dong-hyun.kim@nottingham.ac.uk
^bSchool of Veterinary Medicine and Science, University of Nottingham, Sutton Bonington Campus, Leicestershire, LE12 5RD, UK

Received 15th September 2022 , Accepted 18th October 2022

First published on 26th October 2022

Abstract

Lameness is a major challenge in the dairy cattle industry in terms of animal welfare and economic implications. Better understanding of the metabolic alterations associated with lameness could lead to early diagnosis and effective treatment, therefore reducing its prevalence. To determine whether metabolic signatures associated with lameness could be discovered with untargeted metabolomics, we developed a novel workflow using direct infusion-tandem mass spectrometry to rapidly analyse (2 min per sample) dried milk spots (DMS) that were stored on commercially available Whatman® FTA® DMPK cards for a prolonged period (8 and 16 days). An orthogonal partial least squares-discriminant analysis (OPLS-DA) method validated by triangulation of multiple machine learning (ML) models and stability selection was employed to reliably identify important discriminative metabolites. With this approach, we were able to differentiate between lame and healthy cows based on a set of lipid molecules and several small metabolites. Among the discriminative molecules, we identified phosphatidylglycerol (PG 35:4) as the strongest and most sensitive lameness indicator based on stability selection. Overall, this untargeted metabolomics workflow was found to be a fast, robust, and discriminating method for determining lameness in DMS samples. The DMS cards can potentially be used as a convenient and cost-effective sample matrix for larger scale research and future routine screening for lameness.

Introduction

Lameness is a major health issue of dairy cows. It impairs sustainability due to animal health and welfare impacts, and therefore has economic and ethical implications.¹ Despite the recent efforts by the dairy industry to reduce lameness levels, a recent study showed that the average prevalence in the UK is as high as 30.1%.^2,3 Early detection and timely treatment are essential to mitigate the impacts of lameness.⁴ The current mainstream method of diagnosis is mobility scoring based on visual assessment of gait by trained observers.⁵ However, pain experienced by lame cows is often masked by their instinctive stoicism, which makes it difficult to diagnose the disease before the appearance of clinical signs.⁶ Another major limitation of this method is intra- and inter-observer variability.⁷ Other authors have reported that pro-inflammatory cytokines and acute-phase proteins (APPs) can be used as biomarkers.^8,9 However, because of the high cost of ELISA tests required for detecting these immune-related molecules, alternative rapid and robust approaches are urgently required for routine screening.

Metabolomics has become an increasingly popular “omics” approach to biomarker discovery.¹⁰ State-of-the-art metabolomics techniques allow the detection of hundreds to thousands of metabolites with a minimal amount of sample.¹¹ It is believed that metabolomics can deliver remarkable achievements in livestock research due to its capability for fast, effective and quantitative metabolic phenotyping.¹² However, few studies have been reported regarding the metabolic alterations associated with lameness. Zheng et al. utilised nuclear-magnetic-resonance (NMR)-based metabolomics to investigate metabolic differences between healthy cows and cows with footrot using blood samples.¹³ Dervishi et al. used gas chromatography-mass spectrometry (GC-MS) to investigate the metabolic signatures in serum samples of lame cows during different stages of lameness development.¹⁴ Eckel et al. showed that metabolic alterations during disease development could be identified from cow urine using liquid chromatography (LC)-MS.¹⁵ However, metabolic alterations in lame cows have not yet been investigated using milk, which is a desirable source as it is easily accessible and can be collected in a non-invasive manner.

In real-world settings, farmers may face logistical challenges sending samples from farms to laboratories for lameness diagnosis using metabolomics techniques. This arises from the need for temperature regulation (usually at −80 °C) during storage and transportation of conventional liquid biological samples. Hence, dried matrix (i.e., blood, urine, milk) spots on paper are a more attractive sample type because of the low sample volume required, ease of collection, room-temperature storage, and low-cost postal shipping. Although dried blood spots (DBSs) have been used in a wide range of research including metabolomics studies,¹⁶ few studies have explored the use of different dried milk spot systems,^17–19 and none for veterinary or agricultural applications. These established dried milk spot (DMS) systems studied human breast milk and often require pre-treatment of papers using different protocols, which can introduce extra inconsistencies between studies. Here, we propose that the commercially available Whatman® FTA® DMPK cards, which were originally designed for DBSs, can potentially be used as a simpler way of collecting, preserving, and storing bovine milk samples for metabolomics research.

In metabolomics research, popular statistical approaches for identifying metabolic differences between classes are multivariate analysis techniques such as orthogonal partial least squares (OPLS) and univariate analysis (e.g. Student's t-test^20–23). However, these “conventional” methods have innate limitations, especially when handling complex metabolomic data. Firstly, OPLS tends to construct prediction models that remove systematic variation that does not agree with the assigned group classification, and therefore to force separation in scores space.²⁴ Without rigorous validation, the “significant” results and “important” variables could be generated by the model solely by chance. Secondly, for Student's t-test, the idea of hypothesising “there is a difference” based on the concept of statistical significance and p values has been increasingly criticised, as it provides fairly limited information about the data and can be easily misinterpreted.²⁵ The triangulation of multiple machine learning methods can yield valuable insights into the reliability of the results generated from the statistical workflow described above. It can also mitigate the issue of results being method-dependent and improve the likelihood of identifying truly important variables.²⁶ Furthermore, since covariate selection using conventional regression approaches often has high variability and relatively low reproducibility, stability selection could be incorporated into prediction models.^27–30 This strategy can help identify the predictors that are most stable under resampling and are therefore likely to be the strongest candidates as disease indicators among significant metabolites.

Here, we investigated the metabolic alterations in lame cows compared with non-lame cows and assessed the suitability of Whatman® cards as a DMS medium by using a direct infusion method with a TriVersa NanoMate sampling system coupled to high-resolution MS. This direct infusion method allows high-speed analysis (2 min per sample) in an ambient environment.³¹ This feature allows rapid screening for potential biomarkers, which may also make it possible to conduct large-scale research and routine lameness testing for dairy cows in the future. Furthermore, with the strategy of using triangulation of multiple statistical models, we were able to identify potential disease predictors.

Experimental

DMS sample preparation

Milk drops were collected directly onto Whatman® FTA® DMPK cards (Fig. 1) from 10 lame cows and 11 healthy cows from one dairy farm based at the Centre for Dairy Science Innovation (CDSI), University of Nottingham. It was a research dairy herd containing 300 cows that produce milk commercially. Cows were housed continuously with sand-bedded cubicles and slatted flooring. Lame and healthy control cows were identified based on visual assessment using the Agriculture and Horticulture Development Board (AHDB) scoring system (0 to 3), where lame was defined as score ≥2 and healthy (non-lame) as score <2.³² Each spot on the FTA® DMPK cards contained one drop of milk (∼20 μL). The DMS cards were air-dried, then stored in plastic seal bags at room temperature. After 8 days, part of each spot was removed from the cards into 1.5 mL Eppendorf tubes (Eppendorf AG, Hamburg, Germany) using a 6 mm hole puncher. Extraction of metabolites from each sample was conducted with a 500 μL mixture of 70% v/v methanol (VWR, West Sussex, UK) and 30% v/v water to which MS-grade formic acid (Optima LC−MS grade; Fisher Scientific, Loughborough, UK) was added (final concentration, 0.1% v/v). Deionised water was prepared using a Milli-Q water purification system (Millipore, MA, USA). After mixing and incubating in the extraction solvent for 20 min, the samples were centrifuged (MiniSpin®, Eppendorf AG, Hamburg, Germany) for 10 min at 6708 × g. Then, 200 μL of supernatant was transferred to a clean Eppendorf tube. To dilute the extracted metabolites, a further 800 μL of extraction solvent was added to each sample. The procedure was adapted from a metabolite extraction method for dried blood spots by Trifonova et al.³³ To assess sample stability during a prolonged storage time at room temperature, the same metabolite extraction procedure was repeated on day 16 using adjacent milk spots.


	Fig. 1 Example of dried milk spots on a Whatman® FTA® DMPK card.

Mass spectrometry analysis

The solvents containing extracted metabolites were transferred into a 96-well plate, then 10 μL were directly infused into a high-resolution Q-Exactive plus Orbitrap spectrometer (Thermo Fisher Scientific, Hemel Hempstead, UK) via chip-based nanoelectrospray ionisation (Advion Biosciences, Ithaca, NY) at 1.5 kV and 0.6 psi gas pressure. Data was acquired for 1 min for each polarity using a scan range of m/z 70–1050. In full MS mode, the resolution was set to 140 000 at m/z 200, and the AGC target was set to 3 × 10⁶ with a maximum ion injection time of 200 ms. The top 10 most intense ions were isolated within a 0.5 m/z window for data-dependent acquisition (DDA) at a resolution of 17 500, AGC target of 1 × 10⁶ and a maximum ion injection time of 50 ms. For data-independent acquisition (DIA), the pre-selected ions were isolated within a 0.4 m/z window at a resolution of 35 000 and AGC target of 2 × 10⁵. Stepped normalized collision energy (NCE) of 20, 30 and 40 was applied in both DDA and DIA. The pooled QC samples were analysed intermittently for the duration of the MS analysis.

Peak picking and alignment

The .RAW data files from Xcalibur were converted to .mzXML format using ProteoWizard.³⁴ Peaks with intensities above 100 000 were picked and aligned within a 5 ppm m/z window using an in-house MATLAB (R2020a, The MathWorks, Inc., Natick, MA) script.¹¹ Features with more than 20% missing values across all samples were removed. The remaining missing values were imputed using k-nearest neighbour (knn) imputation (k = 10).³⁵ Individual ion intensity matrices from both polarities were concatenated using a low-level data fusion strategy.³⁶
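The missing-value handling described above can be sketched as follows. This is a minimal illustration, not the authors' in-house MATLAB script: the function names are hypothetical, missing intensities are assumed to be encoded as None, rows are assumed to be samples and columns features, and the neighbour search here runs over samples with a root-mean-square distance on jointly observed features, which is only one of several possible kNN imputation variants.

```python
import math

def filter_features(matrix, max_missing=0.20):
    """Drop columns (features) whose fraction of None exceeds max_missing."""
    n = len(matrix)
    keep = [j for j in range(len(matrix[0]))
            if sum(row[j] is None for row in matrix) / n <= max_missing]
    return [[row[j] for j in keep] for row in matrix], keep

def _distance(a, b):
    """RMS difference over the features observed in both samples."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in pairs) / len(pairs))

def knn_impute(matrix, k=10):
    """Replace each None by the mean of that feature in the k nearest samples.

    Assumes every column has at least one observed value, which holds
    after filter_features has been applied.
    """
    out = [row[:] for row in matrix]
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is None:
                donors = sorted(
                    (_distance(row, other), other[j])
                    for m, other in enumerate(matrix)
                    if m != i and other[j] is not None
                )[:k]
                out[i][j] = sum(v for _, v in donors) / len(donors)
    return out

# Toy intensity matrix (hypothetical values): 5 samples x 3 features.
data = [
    [1.0, 10.0, None],
    [1.1, None, None],
    [0.9, 11.0, 5.0],
    [1.0, 10.5, None],
    [5.0, 50.0, None],
]
filtered, kept = filter_features(data)   # third feature is 80% missing
imputed = knn_impute(filtered, k=2)      # small k only because the toy set is tiny
```

In this toy case the third feature is removed and the single gap in the second feature is filled from the two samples whose first-feature intensity is closest.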

Multivariate and univariate analysis

Following the feature extraction workflow, the data were normalised to total ion count, log-transformed and Pareto scaled.³⁷ Principal component analysis (PCA) and orthogonal partial least squares discriminant analysis (OPLS-DA) models were constructed using SIMCA 16 software (Umetrics, Sweden). Selection of discriminative variables was based on a variable importance in projection (VIP) score > 1 in OPLS-DA models. The models were validated using the built-in leave-one-out cross validation (LOOCV) procedure and a permutation test. Student's t-test with controlled false discovery rate (FDR) (q < 0.05) was performed in MetaboAnalyst 5.0.^38,39 A p-value < 0.05 was considered significant.
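The three pre-processing steps named above (total ion count normalisation, log transformation, Pareto scaling) can be sketched as below. This is an illustrative sketch only: the function names are hypothetical, the logarithm base is not stated in the text and base 10 is assumed here, and all intensities are assumed to be strictly positive after imputation.

```python
import math

def tic_normalise(matrix):
    """Divide each sample (row) by its total ion count."""
    return [[v / sum(row) for v in row] for row in matrix]

def log_transform(matrix):
    """Base-10 log of every intensity (the base is an assumption)."""
    return [[math.log10(v) for v in row] for row in matrix]

def pareto_scale(matrix):
    """Mean-centre each feature and divide by the square root of its SD."""
    n = len(matrix)
    cols = list(zip(*matrix))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / (n - 1))
           for c, m in zip(cols, means)]
    return [[(v - m) / math.sqrt(s) if s > 0 else 0.0
             for v, m, s in zip(row, means, sds)]
            for row in matrix]

# Hypothetical raw intensities: 3 samples x 2 features.
raw = [[100.0, 300.0], [200.0, 200.0], [300.0, 100.0]]
normalised = tic_normalise(raw)
scaled = pareto_scale(log_transform(normalised))
```

Pareto scaling divides by the square root of the standard deviation rather than the standard deviation itself, so high-variance features are down-weighted less aggressively than with unit-variance scaling.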

Machine learning and stability selection

Following the feature extraction workflow, the data were normalised to total ion count and standardised. Four common supervised machine learning (ML) techniques were performed in R,⁴⁰ including random forest (RF),⁴¹ elastic net,⁴² partial least squares (PLS),⁴³ and support vector machine (SVM).⁴⁴ Prediction accuracy of each model was assessed using LOOCV: for each ML algorithm, 1 cow was chosen as test set and the remaining cows were used as training set. This procedure was repeated 20 times for each model. Recursive feature elimination (RFE) was used for all algorithms to identify the smallest set of metabolites that provided maximum predictive accuracy; this was conducted external to the LOOCV procedure to ensure no selection bias occurred.⁴⁵
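The leave-one-out loop described above can be sketched in a model-agnostic way. The sketch below is an assumption-laden illustration rather than the R code used in the study: the function names are hypothetical, and a simple nearest-centroid classifier stands in for RF, elastic net, PLS and SVM so that the example stays self-contained.

```python
def nearest_centroid_fit(X, y):
    """Stand-in classifier: store the mean feature vector of each class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def nearest_centroid_predict(model, x):
    """Assign the class whose centroid is closest in squared distance."""
    return min(model, key=lambda lab: sum((a - b) ** 2
                                          for a, b in zip(x, model[lab])))

def loocv_accuracy(X, y, fit, predict):
    """Hold out each cow once, train on the rest, and score the held-out cow."""
    correct = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = fit(X_train, y_train)
        correct += predict(model, X[i]) == y[i]
    return correct / len(X)

# Hypothetical two-feature data for six cows.
X = [[1.0, 1.2], [0.9, 1.0], [1.1, 0.8], [3.0, 3.1], [3.2, 2.9], [2.9, 3.0]]
y = ["healthy", "healthy", "healthy", "lame", "lame", "lame"]
accuracy = loocv_accuracy(X, y, nearest_centroid_fit, nearest_centroid_predict)
```

With a cohort of 21 cows, a single misclassified held-out cow corresponds to an accuracy of 20/21 ≈ 95.2%.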

Stability selection was performed using the stabiliser package.⁴⁶ Three penalised models were constructed: elastic net,⁴² minimax convex penalty (MCP)⁴⁷ and least absolute shrinkage and selection operator regression (Lasso).⁴⁸ Selection stability was evaluated for each model as the percentage of times that each variable was selected across 500 bootstrap samples.⁴⁹ To estimate the stability threshold, the outcome was permuted 20 times to generate 20 new datasets in which the relationship between the outcome and the observations was severed. The threshold was determined as the highest stability score achieved in the permuted datasets over 50 bootstrap samples across each of the 20 permuted datasets.⁵⁰ A bootstrap p-value was defined as the proportion of coefficient estimates on the minority side of zero. For example, if a variable was selected on 100 occasions and the coefficients on 95 occasions were greater than 0, then the bootstrap p-value would be (100 − 95)/100 = 0.05.
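The three quantities defined above (stability score, permutation threshold and bootstrap p-value) reduce to simple counting once the bootstrap fits are available. The sketch below assumes the penalised models have already been fitted elsewhere; the function names are hypothetical and do not reproduce the stabiliser package interface.

```python
def stability_score(selected_flags):
    """Percentage of bootstrap samples in which the variable was selected."""
    return 100 * sum(selected_flags) / len(selected_flags)

def permutation_threshold(permuted_scores):
    """Highest stability score seen in any permuted (null) dataset.

    permuted_scores: one list of per-variable stability scores per
    permuted dataset.
    """
    return max(max(scores) for scores in permuted_scores)

def bootstrap_p_value(coefficients):
    """Proportion of non-zero coefficient estimates on the minority side of zero."""
    nonzero = [c for c in coefficients if c != 0]
    positive = sum(c > 0 for c in nonzero)
    negative = len(nonzero) - positive
    return min(positive, negative) / len(nonzero)

# Worked example from the text: selected 100 times, 95 positive coefficients.
coefs = [0.3] * 95 + [-0.1] * 5
p_boot = bootstrap_p_value(coefs)            # (100 - 95) / 100

# Hypothetical variable selected in 100 of 500 bootstrap samples.
flags = [True] * 100 + [False] * 400
score = stability_score(flags)

# Hypothetical null scores from two permuted datasets.
threshold = permutation_threshold([[4.0, 12.0, 8.0], [6.0, 10.0, 2.0]])
```

A variable would then be retained when its stability score exceeds the permutation threshold and its bootstrap p-value is low.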

Metabolite identification

The ion masses of important variables were searched against the Bovine Metabolome Database⁵¹ with 5 ppm mass tolerance and Lipid Maps⁵² with ±0.001 m/z tolerance using [M + H]⁺, [M + Na]⁺, [M + K]⁺, and [M + H − H₂O]⁺ as adducts for positive mode, and [M − H]⁻ and [M − H − H₂O]⁻ for negative mode. For the lipid search, multiply charged adducts were also included. For structure-based identification, MS/MS spectra were matched with experimental reference spectra acquired at the same normalised collision energy using mzCloud, by comparing the fragmentation patterns and the accurate masses of the fragments. For compounds that were not recorded in the mzCloud database, accurately predicted ESI-MS/MS spectra generated by the CFM-ID program were used to improve confidence in identification.⁵³
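The mass-tolerance check used in such a database search can be sketched as follows. This is an illustration under stated assumptions: the function and dictionary names are hypothetical, only three singly charged positive-mode adducts are included, and the adduct mass shifts are standard values for H⁺, Na⁺ and K⁺ (atomic mass minus one electron) rather than values taken from the paper.

```python
# Mass added to the neutral monoisotopic mass by each singly charged adduct (Da).
ADDUCT_SHIFT = {
    "[M+H]+": 1.007276,
    "[M+Na]+": 22.989218,
    "[M+K]+": 38.963158,
}

def adduct_mz(monoisotopic_mass, adduct):
    """Theoretical m/z of a singly charged positive adduct."""
    return monoisotopic_mass + ADDUCT_SHIFT[adduct]

def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6

def within_tolerance(observed_mz, theoretical_mz, ppm=5.0):
    """True if the observed ion lies within the ppm search window."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= ppm

# Alpha-lactose: monoisotopic mass 342.1162 Da, observed at m/z 343.1228.
theoretical = adduct_mz(342.1162, "[M+H]+")
error = ppm_error(343.1228, theoretical)
match = within_tolerance(343.1228, theoretical)
```

For alpha-lactose this gives an error of about −2 ppm, inside the 5 ppm window. Because the inputs here are rounded to four decimal places, errors recomputed this way can differ by a few tenths of a ppm from values calculated with unrounded masses.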

Results and discussion

Rapid metabolic profiling workflow for the investigation of dairy cow lameness

Commercially available Whatman FTA® DMPK cards were used to collect milk drops directly from dairy cows (Fig. 2A and B). To evaluate analyte stability during prolonged storage periods, the dried milk spot (DMS) cards were stored in plastic seal bags at room temperature for 8 and 16 days, respectively, until metabolite extraction (Fig. 2C). The extracted samples were directly infused into a high-resolution mass spectrometer for rapid metabolomic analyses (2 min per sample) using a robotic sampling system (Fig. 2D). The idea behind this workflow was to explore the use of the DMS sample matrix for easy sample collection and low-cost postal delivery from farms to analytical laboratories for rapid lameness diagnosis. This could potentially be an attractive option as it omits the need for temperature regulation during storage and transportation that applies to conventional liquid samples. It also eliminates the inter- and intra-observer variability associated with the current diagnostic approach. For laboratories, the simple sample preparation procedure and rapid analysis with a direct infusion system could allow high throughput for large-scale veterinary clinical research. For data analysis, we added machine learning and stability selection to the conventional workflow of OPLS-DA and t-test to mitigate the issue of results being method-dependent and to identify the most stable predictors (Fig. 2E).


	Fig. 2 Rapid analysis of dried milk spot cards with direct infusion mass spectrometry. (A) Milk drops were collected directly from cows onto commercially available Whatman FTA® DMPK Cards. (B) Milk spots were air-dried, then stored in plastic seal bags at room temperature for 8 and 16 days. (C) Spot cards were punched for metabolite extraction. (D) The extraction solvent containing milk metabolites was transferred into a 96-well plate, then delivered to the mass spectrometer using the automated sampling system, TriVersa NanoMate LESA®. (E) Multivariate and univariate analyses were first carried out for class prediction and identification of lameness-related metabolites. The results were further validated using machine learning and stability selection.

Metabolic profiles of DMS differentiate lame and control cows

The milk metabolic profiles acquired by direct infusion MS were used to discriminate the four sample groups (day 8 extracts – control/lame, day 16 extracts – control/lame). Features in positive and negative ion modes were combined and used to construct a PCA model (number of components A = 6, number of observations N = 48) (Fig. 3A),⁵⁴ in which DMS extracted on day 8 and day 16 after sample collection are clearly separated along the first principal component (PC 1) (x-axis). In the PCA plot, the pooled QC samples were located in the middle of all analysed samples from the same extraction day, which indicated good reproducibility during the analytical run. The PC 1 loadings plot revealed an overall reduction in signals from day 16 samples compared to day 8 (Fig. 3B).


	Fig. 3 Multivariate analysis results. (A) PCA of dried milk spots extracted on day 8 and day 16 after sample collection. Pooled QC samples showed stable analytical performance. (B) Loadings of PCA principal component 1 and 2. (C) OPLS-DA scores plot reveals clustering of cows based on health conditions (control vs. lame) (R²X 0.321 R²Y 0.899 Q² 0.601) from milk metabolites extracted on day 8. (D) S plot shows ions that have strong correlation with the cow health conditions (orange).

An OPLS-DA model was built (A = 1 predictive component +1 orthogonal component, N = 21) to compare the healthy group and lame group from day 8 extracted metabolites (Fig. 3C). The model was validated using LOOCV method. Clear grouping of the two classes was observed (Q²: 0.601). In general, Q² (goodness-of-prediction) > 0.5 is considered as good predictability,⁵⁵ and 0.4 may also be considered acceptable for biological models.^11,23 To mitigate the issues with potential overfitting and over estimation of Q², we further conducted a permutation test which confirmed the validity of the constructed model (Fig. S1†). The associated S-plot enabled the determination of the most important ions for distinguishing the control and lame cows (Fig. 3D).⁵⁶ The discriminative ions (highlighted in orange colour) were determined by a VIP score > 1 in OPLS-DA and a p value < 0.05 in multiple t-test (FDR corrected, q < 0.05). Ten out of 12 discriminative ions were assigned putative molecular formulae (Table 1). To further confirm the identities of these discriminative ions, both experimental and computed MS/MS spectra were used for structure-based identification (Fig. S3–S8†).
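The combined filter described above (VIP > 1, p < 0.05 and FDR-corrected q < 0.05) can be sketched as below. The sketch assumes the FDR correction is the Benjamini-Hochberg procedure, which the text does not state explicitly; the function names and the VIP and p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values), in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return q

def select_discriminative(vip, p_values, vip_cut=1.0, alpha=0.05):
    """Indices of features passing the VIP, p-value and q-value cut-offs."""
    q = benjamini_hochberg(p_values)
    return [i for i in range(len(vip))
            if vip[i] > vip_cut and p_values[i] < alpha and q[i] < alpha]

# Hypothetical scores for four features.
vip = [1.8, 0.7, 1.5, 2.0]
p = [0.001, 0.02, 0.04, 0.30]
q = benjamini_hochberg(p)
selected = select_discriminative(vip, p)
```

In this toy case only the first feature survives: the second fails the VIP cut-off, the third has p < 0.05 but q ≈ 0.053 after correction, and the fourth is not significant.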

Table 1 Annotation for discriminative ions (VIP > 1, p-value < 0.05, FDR corrected) of healthy and lame cow groups (day 8). VIP: variable importance in the projection. FDR: false discovery rate

m/z	Adduct	Assignment	Mass error/ppm	Monoisotopic mass (Da)	Identification method
343.995	Unknown	Unknown	Unknown	Unknown	Unknown
315.0416	Unknown	Unknown	Unknown	Unknown	Unknown
267.1968	[M − H₂O − H]⁻	Hexadecanedioic acid	3.0	286.2144	m/z
401.2358	[M + 2Na]²⁺	PG 35:4	1.2	400.2283	m/z, computed MS/MS
317.1149	[M + K]⁺	Alpha-carboxyethyl hydrochroman	0.3	278.1518	m/z
115.0757	[M + H − H₂O]⁺	6-Hydroxyhexanoic acid	−1.7	132.0786	m/z, MS/MS
251.1408	[M + K]⁺	Trans-11-methyl-2-dodecenoic acid	0.0	212.1776	m/z, MS/MS
166.0258	[M + K]⁺	1-Piperideine-2-carboxylic acid	−4.2	127.0633	m/z, computed MS/MS
73.0649	[M + H]⁺	Isobutylaldehyde	1.4	72.0575	m/z, MS/MS
400.2321	[M + H]⁺	Carnitine 13:3;O3	2.2	399.2257	m/z
202.0685	[M + Na]⁺	Glucosamine	−0.5	179.0793	m/z
343.1228	[M + H]⁺	Alpha-Lactose	−2.0	342.1162	m/z, MS/MS

Triangulation of machine learning models for results validation

Four machine learning models (RF, elastic net, PLS, and SVM) were tested by recursive feature elimination and LOOCV (Fig. 4). In RF, a maximum predictive accuracy of 100% was achieved with 15 selected variables. Elastic net, PLS and SVM reached their highest accuracies of 95.2% with 10 selected variables (Table S1†). Comparing the top 10 most important variables (Table S2†) selected by each ML model with the discriminative ions from the conventional workflow based on OPLS and t-test (Table 1), we observed high similarity between variable selections. Interestingly, results from the PLS and OPLS methods were in full agreement, which is probably because the models are constructed based on similar concepts.⁵⁵ Of the 12 discriminative metabolites discovered in the conventional workflow, m/z 202.0685 (glucosamine) and m/z 343.1228 (alpha-lactose) were selected only when (O)PLS was applied (i.e., they were model-dependent). Therefore, they are unlikely to be truly associated with the disease state. The remaining 10 metabolites were selected as predictor variables in multiple distinct models (Fig. 5).
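The triangulation logic described above amounts to asking whether a feature is supported by more than one family of models. The sketch below is a hypothetical illustration: the function name, the grouping of PLS and OPLS-DA into a single family, and the reduced example selection sets are assumptions made for demonstration and do not reproduce Table S2.

```python
def triangulate(selections, families):
    """Split features into those supported by >= 2 model families and the rest.

    selections: model name -> set of selected feature labels
    families:   model name -> family label (related models share a family)
    """
    support = {}
    for model, features in selections.items():
        for feature in features:
            support.setdefault(feature, set()).add(families[model])
    robust = sorted(f for f, fams in support.items() if len(fams) >= 2)
    model_dependent = sorted(f for f, fams in support.items() if len(fams) == 1)
    return robust, model_dependent

# Illustrative selections keyed by m/z (not the full published lists).
selections = {
    "OPLS-DA": {"401.2358", "115.0757", "343.1228", "202.0685"},
    "PLS": {"401.2358", "115.0757", "343.1228", "202.0685"},
    "RF": {"401.2358", "115.0757"},
    "elastic net": {"401.2358"},
    "SVM": {"401.2358", "115.0757"},
}
families = {
    "OPLS-DA": "PLS-based",
    "PLS": "PLS-based",
    "RF": "tree-based",
    "elastic net": "penalised regression",
    "SVM": "kernel-based",
}
robust, model_dependent = triangulate(selections, families)
```

Grouping PLS with OPLS-DA before counting prevents two closely related methods from being treated as independent confirmation of each other.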


	Fig. 4 Evaluation of the prediction accuracies of four ML models (A) random forest (RF), (B) elastic net, (C) support vector machine (SVM), (D) partial least squares (PLS) using leave-one-out cross validation procedure.


	Fig. 5 Box plots show the relative abundance of discriminative metabolites from day 8 between healthy and lame cows determined by OPLS-DA and Student's t-test (VIP score > 1, p < 0.05). Validating the results by triangulation of multiple machine learning models, we identified two model-dependent predictors, alpha-lactose and glucosamine, which were chosen as “important” predictor only in PLS-based methods, therefore, not likely to be “true” predictors.

Lipids are an important metabolic fuel, and they have various functions in cell activation, immune response and inflammation.¹⁴ In this study, a few fatty acids in milk were found to play an important role in discriminating lame from healthy cows. In lame cows, a relative decrease was observed in a saturated long-chain fatty acid (hexadecanedioic acid) and an unsaturated fatty acid (trans-11-methyl-2-dodecenoic acid) compared with healthy cows. In contrast, the lame cows had a relatively elevated abundance of an omega-hydroxy fatty acid, 6-hydroxyhexanoic acid. Other significantly altered lipids were phosphatidylglycerol PG 35:4 and fatty acyl carnitine CAR 13:3;O3, both of which were decreased in the lame group compared to the healthy group. In previously reported studies using plasma and urine samples, distinct metabolite profiles between lame cows and controls were displayed by a number of acylcarnitines and glycerophospholipids.^15,57 The alterations in these lipid species were linked to inflammation and immune response. Acylcarnitines also play an important role in the lipid β-oxidation process.⁵⁷

Interestingly, while most reported lipid markers in serum or plasma displayed elevated concentration in lame cows, we discovered many lipid predictors with decreased abundance in milk in this study. Further investigation is required to determine the underlying reasons for these alterations in milk. In addition, we observed an increase in 1-piperideine-2-carboxylic acid, which is a metabolite in the pipecolic acid pathway of lysine degradation.⁵⁸ In dairy cows, lysine is important for milk protein synthesis, carnitine synthesis, weight gain in growing cattle, and incorporation into mammalian tissues for structural integrity.⁵⁹ The increased 1-piperideine-2-carboxylic acid in milk may indicate abnormal lysine metabolism in lame cows. Another significantly increased small molecule in lame cows is isobutylaldehyde. Its role in bovine metabolism is not yet fully elucidated.

Selection of the most stable predictors

High variability of results and low reproducibility are common issues with conventional regression methods (i.e., a single, non-bootstrapped regression model) for covariate selection from high dimensional data in comparison to stability selection.^27–30 In our study, a majority of the discriminative metabolites discovered in single machine learning models were not repeatedly selected when bootstrap resampling was followed by one-off regression analyses, as indicated by their low stability scores (Table S3†). Bootstrap resampling is a statistical technique that uses random sampling with replacement to mimic the real-world sampling process. For instance, isobutylaldehyde was only selected as a “discriminative” metabolite in 9.6% to 36.6% of resamples depending on the model type, which means it is highly likely that this metabolite will not be identified as a potential marker in another study where a single one-off regression model is employed. This can make biomarker screening challenging because the selected predictors may be incomparable between studies or analyses and fail to represent the target population. A solution to this issue is stability selection. The concept is that the variables truly associated with the outcome of interest are likely to be selected most frequently during multiple bootstrap resampling.⁶⁰ Here, three penalised models (elastic net, MCP and Lasso) were implemented for selecting the most stable predictor metabolites.⁶¹ Selection stability was estimated for all models using a bootstrap methodology (500 bootstraps, 20 permutations, 50 permutation bootstraps).³⁰ In each model, variables with a stability score above the estimated threshold and a low bootstrap p-value were selected (Fig. 6). The stability selected variables were m/z 401.2358 (PG 35:4), m/z 315.0416 and m/z 115.0757 (6-hydroxyhexanoic acid, C₆H₁₂O₃) using the default elastic net model, which had also been selected using the triangulation method of OPLS-DA and ML as discussed in the previous section.
It is noteworthy that m/z 401.2358 (PG 35:4) not only showed the highest stability scores in the three stability models, but also had the highest importance rankings (top 2) in all ML models. Therefore, it appears to be the strongest candidate as an indicator of disease state. A receiver operating characteristic (ROC) curve analysis was employed to further assess both the sensitivity and specificity of the predictor metabolites (Fig. S9†). Metabolite m/z 401.2358 (PG 35:4) showed superior prediction performance, with a sensitivity and specificity of 100%.
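How sensitivity and specificity follow from a single-metabolite cut-off can be sketched as below. This is a hedged illustration: the function names and abundance values are hypothetical, and the assumption that lower abundance indicates lameness is made only because PG 35:4 was reported as decreased in the lame group.

```python
def sensitivity_specificity(values, is_lame, threshold):
    """Classify a cow as lame when abundance < threshold, then score the calls."""
    predicted = [v < threshold for v in values]
    tp = sum(p and t for p, t in zip(predicted, is_lame))
    tn = sum((not p) and (not t) for p, t in zip(predicted, is_lame))
    positives = sum(is_lame)
    negatives = len(is_lame) - positives
    return tp / positives, tn / negatives

def best_cutoff(values, is_lame):
    """Midpoint between adjacent values that maximises sensitivity + specificity."""
    ordered = sorted(set(values))
    midpoints = [(a + b) / 2 for a, b in zip(ordered, ordered[1:])]
    return max(midpoints,
               key=lambda t: sum(sensitivity_specificity(values, is_lame, t)))

# Hypothetical relative abundances: three healthy cows, then three lame cows.
abundance = [5.1, 4.8, 5.3, 3.0, 3.4, 2.9]
lame = [False, False, False, True, True, True]
cutoff = best_cutoff(abundance, lame)
sens, spec = sensitivity_specificity(abundance, lame, cutoff)
```

When the two groups do not overlap, as in this toy data, some cut-off yields 100% sensitivity and 100% specificity, which corresponds to an ROC curve passing through the top-left corner.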


	Fig. 6 Variable selection and importance were visualised by plotting selection stability in elastic net, Lasso and minimax convex penalty (MCP) models. Variables that were never selected in any bootstrap or with a stability score below 5% were not shown in the figure. The line on the graph represents the calculated threshold to determine a cut-off for ‘important’ covariates.

Conclusions

A novel analytical workflow for untargeted metabolomics of dried milk spots (DMS) using direct infusion mass spectrometry was developed and shown to be a robust and discriminating approach for diagnosing lameness in dairy cows. Some important predictor metabolites have been discovered for the first time using the triangulation of multiple statistical models including OPLS-DA, ML models and stability selection. This statistical workflow allowed identification of the most promising candidates for indicating lameness and elimination of model-dependent “predictors”, which increased the reliability of the outcome. Phosphatidylglycerol and fatty acid species were found to be strong and sensitive candidates as indicators of lameness. Furthermore, we showed that Whatman® FTA® DMPK paper cards, a new sample medium for milk collection, can be used for cost-effective and fast veterinary screening because they omit the need for the temperature regulation often required during the transportation and storage of conventional liquid samples. DMS samples from healthy and lame cows can be clearly distinguished by their metabolite profiles after storage at room temperature for up to 8 days. This opens new opportunities to perform large-scale routine diagnosis of lameness, using milk as a sample that farmers can easily collect at low cost.

This experiment is a proof-of-concept study exploring the use of DMS as a sample matrix for studying lameness using an untargeted metabolomics method. We acknowledge that the number of lame cows included in this study was low, and all cows were from the same farm. Future work should include larger cohorts of animals from multiple farms to further validate the current findings and determine the underlying reasons for the observed metabolic alterations. Furthermore, future work should include the study of DMS samples from pre-lame cows, to determine whether this workflow can be used to predict lameness and to diagnose it earlier than the current method, which relies on the physical signs of lameness being apparent. This could then pave the way for early interventions in the future.

This developed analytical workflow and statistical strategy can also be applied to explore a wide range of diseases using dried liquid samples such as milk, blood and urine as a fast and robust untargeted method to determine the presence of potential biomarkers in the sample of choice.

Conflicts of interest

There are no conflicts to declare.

Acknowledgements

This work was supported by the BBSRC [BB/T0083690/1]; the Academy of Medical Sciences, Starter Grants for Clinical Lecturers scheme [SGL023_1096]. Dr Griffiths is the recipient of a University of Nottingham independent Fellowship through their Anne McLaren Fellowship scheme.

Footnote

† Electronic supplementary information (ESI) available. See DOI: https://doi.org/10.1039/d2an01520j
