Sergio Coronado, Valentin Plugaru, Piotr Gawron, Laurens van der Maaten, Cedric Christian Laczny, Houry Hera Margossian, Paul Wilmes, Arash Atashpendar, Tomasz Sternal, Nikos Vlassis, Fonds National de la Recherche - FnR [sponsor], Luxembourg Centre for Systems Biomedicine (LCSB): Eco-Systems Biology (Wilmes Group) [research center], Luxembourg Centre for Systems Biomedicine (LCSB): Bioinformatics Core (R. Schneider Group) [research center], and Luxembourg Centre for Systems Biomedicine (LCSB): Machine Learning (Vlassis Group) [research center]
Background: Metagenomics is limited in its ability to link distinct microbial populations to genetic potential due to a current lack of representative isolate genome sequences. Reference-independent approaches, which exploit for example inherent genomic signatures for the clustering of metagenomic fragments (binning), offer the prospect to resolve and reconstruct population-level genomic complements without the need for prior knowledge.

Results: We present VizBin, a Java™-based application which offers efficient and intuitive reference-independent visualization of metagenomic datasets from single samples for subsequent human-in-the-loop inspection and binning. The method is based on nonlinear dimension reduction of genomic signatures and exploits the superior pattern recognition capabilities of the human eye-brain system for cluster identification and delineation. We demonstrate the general applicability of VizBin for the analysis of metagenomic sequence data by presenting results from two cellulolytic microbial communities and one human-borne microbial consortium. The superior performance of our application compared to other analogous metagenomic visualization and binning methods is also presented.

Conclusions: VizBin can be applied de novo for the visualization and subsequent binning of metagenomic datasets from single samples, and it can be used for the post hoc inspection and refinement of automatically generated bins. Due to its computational efficiency, it can be run on common desktop machines and enables the analysis of complex metagenomic datasets in a matter of minutes. The software implementation is available at https://claczny.github.io/VizBin under the BSD License (four-clause) and runs under Microsoft Windows™, Apple Mac OS X™ (10.7 to 10.10), and Linux.

Electronic supplementary material: The online version of this article (doi:10.1186/s40168-014-0066-1) contains supplementary material, which is available to authorized users.
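
To illustrate what a "genomic signature" can look like in practice, the following minimal Python sketch computes a normalized oligonucleotide (k-mer) frequency vector for a single sequence fragment. It is an illustration under stated assumptions, not VizBin's implementation: the function name `kmer_signature`, the choice of k = 4, and the decision to skip k-mers containing ambiguous bases are all hypothetical and not taken from the text above. The nonlinear dimension reduction step that would follow is not shown.

```python
from collections import Counter
from itertools import product


def kmer_signature(seq, k=4):
    """Return the relative frequency of every DNA k-mer in `seq`.

    The vector has 4**k entries, ordered lexicographically over A, C, G, T.
    k=4 is an arbitrary illustrative default (an assumption, not from the text).
    """
    seq = seq.upper()
    # All possible k-mers over the unambiguous DNA alphabet.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Keep only windows made of A/C/G/T; windows with N etc. are ignored.
    raw = [counts.get(km, 0) for km in kmers]
    total = sum(raw)
    if total == 0:
        return [0.0] * len(kmers)
    return [c / total for c in raw]


if __name__ == "__main__":
    signature = kmer_signature("ACGTACGTTTGACCA", k=2)
    print(len(signature), round(sum(signature), 6))
```

Each fragment in a dataset would yield one such vector, and fragments from the same population are expected to have similar vectors, which is what makes clustering on these signatures possible without reference genomes.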