Author: "Kristin C Gunsalus" / Journal: bmc bioinformatics - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Kristin C Gunsalus"' showing total 3 results


1. Pheniqs 2.0: accurate, high-performance Bayesian decoding and confidence estimation for combinatorial barcode indexing

Author: Lior Galanti, Dennis Shasha, and Kristin C. Gunsalus
Subjects: QH301-705.5, Computer science, Computer applications to medicine. Medical informatics, Posterior probability, Bayesian probability, R858-859.7, Barcode, Single-cell split-pooling, Biochemistry, law.invention, 03 medical and health sciences, 0302 clinical medicine, Structural Biology, law, Combinatorial indexing, DNA Barcoding, Taxonomic, Barcode simulation, Biology (General), Sequence demultiplexing, Molecular Biology, 030304 developmental biology, Electronic Data Processing, 0303 health sciences, Applied Mathematics, Search engine indexing, Barcode decoding confidence, Probabilistic logic, High-Throughput Nucleotide Sequencing, Reproducibility of Results, Bayes Theorem, Sequence Analysis, DNA, Computer Science Applications, Edit distance, Barcode noise filtering, Precision and recall, Algorithm, Software, 030217 neurology & neurosurgery, Decoding methods
Abstract: Background: Systems biology increasingly relies on deep sequencing with combinatorial index tags to associate biological sequences with their sample, cell, or molecule of origin. Accurate data interpretation depends on the ability to classify sequences based on correct decoding of these combinatorial barcodes. The probability of correct decoding is influenced by both sequence quality and the number and arrangement of barcodes. The rising complexity of experimental designs calls for a probability model that accounts for both sequencing errors and random noise, generalizes to multiple combinatorial tags, and can handle any barcoding scheme. The needs for reproducibility and community benchmark standards demand a peer-reviewed tool that preserves decoding quality scores and provides tunable control over classification confidence that balances precision and recall. Moreover, continuous improvements in sequencing throughput require a fast, parallelized and scalable implementation.
Results and discussion: We developed a flexible, robustly engineered software that performs probabilistic decoding and supports arbitrarily complex barcoding designs. Pheniqs computes the full posterior decoding error probability of observed barcodes by consulting basecalling quality scores and prior distributions, and reports sequences and confidence scores in Sequence Alignment/Map (SAM) fields. The product of posteriors for multiple independent barcodes provides an overall confidence score for each read. Pheniqs achieves greater accuracy than minimum edit distance or simple maximum likelihood estimation, and it scales linearly with core count to enable the classification of > 11 billion reads in 1 h 15 m using < 50 megabytes of memory. Pheniqs has been in production use for seven years in our genomics core facility.
Conclusion: We introduce a computationally efficient software that implements both probabilistic and minimum distance decoders and show that decoding barcodes using posterior probabilities is more accurate than available methods. Pheniqs allows fine-tuning of decoding sensitivity using intuitive confidence thresholds and is extensible with alternative decoders and new error models. Any arbitrary arrangement of barcodes is easily configured, enabling computation of combinatorial confidence scores for any barcoding strategy. An optimized multithreaded implementation assures that Pheniqs is faster and scales better with complex barcode sets than existing tools. Support for POSIX streams and multiple sequencing formats enables easy integration with automated analysis pipelines.
Published: 2021
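
Illustrative note on record 1: the abstract describes computing a posterior decoding probability from basecalling quality scores and priors, and multiplying posteriors across independent barcodes. The sketch below is one way that idea might look, not the Pheniqs implementation. The per-base error model (mismatch probability split evenly over three bases), the uniform random-noise model, the `noise_prior` value, and all function names are assumptions made for illustration.

```python
from math import prod


def phred_to_error(q):
    """Convert a Phred quality score to a base-call error probability."""
    return 10 ** (-q / 10)


def likelihood(observed, qualities, barcode):
    """P(observed | barcode) under an assumed independent per-base error model."""
    return prod(
        (1 - phred_to_error(q)) if o == b else phred_to_error(q) / 3
        for o, q, b in zip(observed, qualities, barcode)
    )


def decode(observed, qualities, priors, noise_prior=0.01):
    """Return (best barcode, posterior probability that it is correct).

    priors maps each candidate barcode to its prior probability;
    noise_prior is a hypothetical prior for reads that are random noise.
    """
    joint = {b: p * likelihood(observed, qualities, b) for b, p in priors.items()}
    # Assumed noise model: every base is uniformly random over A, C, G, T.
    noise = noise_prior * 0.25 ** len(observed)
    total = sum(joint.values()) + noise
    best = max(joint, key=joint.get)
    return best, joint[best] / total


def combined_confidence(posteriors):
    """Overall confidence for a read carrying several independent barcodes."""
    return prod(posteriors)


priors = {"ACGT": 0.495, "TGCA": 0.495}
best, confidence = decode("ACGT", [30, 30, 30, 30], priors)
overall = combined_confidence([confidence, 0.9])
```

A read would then be classified only when its confidence exceeds a chosen threshold, which is the precision/recall control the abstract refers to.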

2. NASQAR: a web-based platform for high-throughput sequencing data analysis and visualization

Author: Mohammed Khalfan, Kristin C. Gunsalus, Nizar Drou, Ayman Yousif, and Jillian Rowe
Subjects: Computer science, Dynamic web page, lcsh:Computer applications to medicine. Medical informatics, Biochemistry, World Wide Web, User-Computer Interface, 03 medical and health sciences, Exploratory data analysis, Resource (project management), 0302 clinical medicine, Software, Structural Biology, Web application, RNA-Seq, Transcriptomics, lcsh:QH301-705.5, Molecular Biology, Interactive visualization, 030304 developmental biology, Internet, 0303 health sciences, business.industry, Gene Expression Profiling, Applied Mathematics, High-Throughput Nucleotide Sequencing, Genomics, Computer Science Applications, Visualization, Graphical user interface, lcsh:Biology (General), Metagenomics, lcsh:R858-859.7, Data pre-processing, business, 030217 neurology & neurosurgery
Abstract: Background: As high-throughput sequencing applications continue to evolve, the rapid growth in quantity and variety of sequence-based data calls for the development of new software libraries and tools for data analysis and visualization. Often, effective use of these tools requires computational skills beyond those of many researchers. To ease this computational barrier, we have created a dynamic web-based platform, NASQAR (Nucleic Acid SeQuence Analysis Resource).
Results: NASQAR offers a collection of custom and publicly available open-source web applications that make extensive use of a variety of R packages to provide interactive data analysis and visualization. The platform is publicly accessible at http://nasqar.abudhabi.nyu.edu/. Open-source code is on GitHub at https://github.com/nasqar/NASQAR, and the system is also available as a Docker image at https://hub.docker.com/r/aymanm/nasqarall. NASQAR is a collaboration between the core bioinformatics teams of the NYU Abu Dhabi and NYU New York Centers for Genomics and Systems Biology.
Conclusions: NASQAR empowers non-programming experts with a versatile and intuitive toolbox to easily and efficiently explore, analyze, and visualize their transcriptomics data interactively. Popular tools for a variety of applications are currently available, including Transcriptome Data Preprocessing, RNA-seq Analysis (including Single-cell RNA-seq), Metagenomics, and Gene Enrichment.
Published: 2020

3. Erratum to: MINE: Module Identification in Networks

Author: Kristin C. Gunsalus and Kahn Rhrissorrakrai
Subjects: 0301 basic medicine, Computer science, Brute-force search, Saccharomyces cerevisiae, Computational biology, computer.software_genre, Biochemistry, 03 medical and health sciences, Structural Biology, Cluster (physics), Animals, Cluster Analysis, Caenorhabditis elegans, Molecular Biology, Applied Mathematics, Published Erratum, Process (computing), Proteins, Regret, Computer Science Applications, Identification (information), 030104 developmental biology, Node (circuits), Data mining, Erratum, computer, Algorithms
Abstract: Graphical models of network associations are useful for both visualizing and integrating multiple types of association data. Identifying modules, or groups of functionally related gene products, is an important challenge in analyzing biological networks. However, existing tools to identify modules are insufficient when applied to dense networks of experimentally derived interaction data. To address this problem, we have developed an agglomerative clustering method that is able to identify highly modular sets of gene products within highly interconnected molecular interaction networks. MINE outperforms MCODE, CFinder, NEMO, SPICi, and MCL in identifying non-exclusive, high modularity clusters when applied to the C. elegans protein-protein interaction network. The algorithm generally achieves superior geometric accuracy and modularity for annotated functional categories. In comparison with the most closely related algorithm, MCODE, the top clusters identified by MINE are consistently of higher density and MINE is less likely to designate overlapping modules as a single unit. MINE offers a high level of granularity with a small number of adjustable parameters, enabling users to fine-tune cluster results for input networks with differing topological properties. MINE was created in response to the challenge of discovering high quality modules of gene products within highly interconnected biological networks. The algorithm allows a high degree of flexibility and user-customisation of results with few adjustable parameters. MINE outperforms several popular clustering algorithms in identifying modules with high modularity and obtains good overall recall and precision of functional annotations in protein-protein interaction networks from both S. cerevisiae and C. elegans.
Published: 2016
