
ZODIAC: database-independent molecular formula annotation using Gibbs sampling reveals unknown small molecules

Authors :
Kai Dührkop
Lihini I. Aluwihare
Martin Hoffmann
Sebastian Böcker
Mustafa Morsy
Markus Fleischauer
Pieter C. Dorrestein
Louis-Félix Nothias
Irina Koester
Marcus Ludwig
Fernando Vargas
Daniel Petras
Publication Year :
2019
Publisher :
Cold Spring Harbor Laboratory, 2019.

Abstract

The confident high-throughput identification of small molecules remains one of the most challenging tasks in mass spectrometry-based metabolomics. SIRIUS has become a powerful tool for the interpretation of tandem mass spectra, and shows outstanding performance for identifying the molecular formula of a query compound, being the first step of structure identification. Nevertheless, the identification of both molecular formulas for large compounds above 500 Daltons and novel molecular formulas remains highly challenging. Here, we present ZODIAC, a network-based algorithm for the de novo estimation of molecular formulas. ZODIAC reranks SIRIUS’ molecular formula candidates, combining fragmentation tree computation with Bayesian statistics using Gibbs sampling. Through careful algorithm engineering, ZODIAC’s Gibbs sampling is very swift in practice. ZODIAC decreases incorrect annotations 16.2-fold on a challenging plant extract dataset with most compounds above 700 Dalton; we then show improvements on four additional, diverse datasets. Our analysis led to the discovery of compounds with novel molecular formulas such as C24H47BrNO8P which, as of today, is not present in any publicly available molecular structure databases.
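
The reranking idea in the abstract can be illustrated with a minimal Gibbs sampling sketch. This is not ZODIAC's implementation or scoring: the function `gibbs_rerank`, the compound names, the candidate scores, the `LINKED` pairs, and the bonus value are all hypothetical toy assumptions. The sketch only shows the general technique of resampling each compound's candidate conditional on its network neighbours and estimating marginal probabilities from visit counts.

```python
import math
import random
from collections import Counter


def gibbs_rerank(candidates, edges, compat, n_iter=2000, burn_in=500, seed=0):
    """Estimate marginal probabilities of candidates by Gibbs sampling.

    candidates: {compound: {formula: positive prior score}}
    edges:      list of (compound_a, compound_b) pairs in the network
    compat:     function (formula_a, formula_b) -> log-scale bonus (hypothetical)
    """
    rng = random.Random(seed)
    neighbors = {c: [] for c in candidates}
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    # Start every compound at its highest-scoring candidate.
    state = {c: max(forms, key=forms.get) for c, forms in candidates.items()}
    counts = {c: Counter() for c in candidates}

    for it in range(n_iter):
        for c, forms in candidates.items():
            names = list(forms)
            # Conditional log-weight: own prior plus agreement with neighbours.
            logw = [
                math.log(forms[f]) + sum(compat(f, state[n]) for n in neighbors[c])
                for f in names
            ]
            top = max(logw)
            weights = [math.exp(w - top) for w in logw]
            state[c] = rng.choices(names, weights=weights)[0]
            if it >= burn_in:
                counts[c][state[c]] += 1

    kept = n_iter - burn_in
    return {c: {f: counts[c][f] / kept for f in candidates[c]} for c in candidates}


# Toy data (made up for illustration, not taken from the paper).
toy_candidates = {
    "A": {"C6H12O6": 0.4, "C7H16O5": 0.6},
    "B": {"C6H10O5": 0.9, "C5H6N4O2": 0.1},
    "C": {"C12H22O11": 0.8, "C13H26O10": 0.2},
}
toy_edges = [("A", "B"), ("A", "C"), ("B", "C")]

# Hypothetical set of candidate pairs assumed to be mutually supportive.
LINKED = {
    frozenset(("C6H12O6", "C6H10O5")),
    frozenset(("C6H12O6", "C12H22O11")),
    frozenset(("C6H10O5", "C12H22O11")),
}


def toy_compat(fa, fb):
    """Return a fixed log-scale bonus when two candidates are assumed linked."""
    return 2.0 if frozenset((fa, fb)) in LINKED else 0.0


posterior = gibbs_rerank(toy_candidates, toy_edges, toy_compat)
best_for_a = max(posterior["A"], key=posterior["A"].get)
print(best_for_a, posterior["A"])
```

In this toy network, compound A's second-ranked candidate overtakes its top-scoring one because it agrees with the likely candidates of both neighbours, which is the kind of reranking effect the abstract describes.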

Details

Language :
English
Database :
OpenAIRE
Accession number :
edsair.doi.dedup.....19775dd4d2374048f186ba49c1d7259e
Full Text :
https://doi.org/10.1101/842740