VOCCluster: Untargeted Metabolomics Feature Clustering Approach for Clinical Breath Gas Chromatography - Mass Spectrometry Data

Yaser Alkhalifah, Iain Phillips, Andrea Soltoggio, Kareen Darnley, William H Nailon, Duncan McLaren, Michael Eddleston, C. L. Paul Thomas and Dahlia Salman.

Department of Chemistry, Loughborough University, United Kingdom.

First published: Analytical Chemistry, December 2nd 2019 (https://doi.org/10.1021/acs.analchem.9b03084).

Abstract

Metabolic profiling of breath analysis involves processing, alignment, scaling and clustering of thousands of features extracted from gas chromatography-mass spectrometry (GC-MS) data from hundreds of participants. The multi-step data processing is complicated, prone to operator error and time-consuming. Automated algorithmic clustering methods that are able to cluster features in a fast and reliable way are necessary. These accelerate metabolic profiling and discovery platforms for next-generation medical diagnostic tools. Our unsupervised clustering technique, VOCCluster, prototyped in Python, handles features of deconvolved GC-MS breath data. VOCCluster was created from a heuristic ontology based on the observation of experts undertaking data processing with a suite of software packages. VOCCluster identifies and clusters groups of volatile organic compounds (VOCs) from deconvolved GC-MS breath data with similar mass spectra and retention index profiles. VOCCluster was used to cluster more than 15,000 features extracted from 74 GC-MS clinical breath samples obtained from participants with cancer before and after radiation therapy. Results were evaluated against a panel of ground truth compounds and compared to other clustering methods (DBSCAN and OPTICS) that were used in previous metabolomics studies. VOCCluster was able to cluster those features into 1081 groups (including endogenous and exogenous compounds and instrumental artefacts) with an accuracy rate of 96% (± 0.04 at 95% confidence interval).
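To make the idea of grouping features by mass spectrum and retention index concrete, here is a minimal Python sketch. It is not the VOCCluster algorithm, which is described only in the full article. It assumes each feature is a dictionary with a retention index and a sparse spectrum, and it uses a simple greedy rule with cosine similarity; the function names, the data layout and both thresholds (`ri_tol`, `min_sim`) are illustrative assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse spectra given as {m/z: intensity}."""
    dot = sum(v * b.get(mz, 0.0) for mz, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_features(features, ri_tol=5.0, min_sim=0.9):
    """Greedy grouping (illustrative assumption, not the published method).

    Features are visited in order of retention index. A feature joins the
    first cluster whose first member lies within ri_tol retention index
    units and has a spectrum with cosine similarity >= min_sim; otherwise
    it starts a new cluster. Returns lists of feature indices.
    """
    clusters = []
    order = sorted(range(len(features)), key=lambda i: features[i]["ri"])
    for i in order:
        feature = features[i]
        for cluster in clusters:
            rep = features[cluster[0]]  # first member acts as representative
            close_ri = abs(feature["ri"] - rep["ri"]) <= ri_tol
            if close_ri and cosine_similarity(
                feature["spectrum"], rep["spectrum"]
            ) >= min_sim:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Hypothetical toy features, invented for illustration only.
features = [
    {"ri": 1000.0, "spectrum": {43: 100, 58: 40}},
    {"ri": 1002.0, "spectrum": {43: 95, 58: 45}},   # same compound as index 0
    {"ri": 1001.0, "spectrum": {91: 100, 92: 60}},  # co-eluting, different spectrum
    {"ri": 1300.0, "spectrum": {43: 100, 58: 40}},  # same spectrum, different RI
]
clusters = cluster_features(features)
print(clusters)
```

In this toy input, features 0 and 1 are grouped because they agree on both criteria, while features 2 and 3 each fail one of them and remain separate. Density-based methods such as DBSCAN and OPTICS, which the abstract uses as comparisons, replace the greedy rule with neighbourhood density criteria.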

View the full article at https://doi.org/10.1021/acs.analchem.9b03084.

 

Our Software

AnalyzerPro


AnalyzerPro® is a productivity software application for LC-MS and GC-MS data with support for multiple vendors’ data. This comprehensive post-processing utility provides optimized workflows for sample-to-sample comparison, target component analysis, quantitation and library searching for data generated from any LC-MS and GC-MS platform. Using its proprietary algorithms to detect obscured components that existing software packages are unable to find without .....

AnalyzerPro XD


AnalyzerPro® XD is the latest version of our productivity software application, now available with support for two-dimensional chromatography. Two-dimensional chromatography is a powerful analytical tool that has evolved from technology used mainly in the R&D laboratory to a robust, commercially available solution from several manufacturers. The technology continues to evolve to address challenges in the analysis of complex samples...

RemoteAnalyzer


RemoteAnalyzer is the only open access software solution for today’s fast-paced laboratory. It is designed to operate and optimize the management of multiple locations, multiple types of analyses, scientists across the full range of skill levels, and multiple instrument types from a variety of vendors. See a case study from the University of Durham regarding their implementation. The walk-up ......

NIST 20 MS and MS/MS Libraries


The NIST/EPA/NIH Mass Spectral Library with Search Program is the standard MS spectra reference database. This library is available with version 2.4 of the full-featured NIST MS Search Program for Windows and includes ....