Pacybara: accurate long-read sequencing for barcoded mutagenized allelic libraries.

Authors :: Weile, Jochen
Ferra, Gabrielle
Boyle, Gabriel
Pendyala, Sriram
Amorosi, Clara
Yeh, Chiann-Ling
Cote, Atina G
Kishore, Nishka
Tabet, Daniel
Loggerenberg, Warren van
Rayhan, Ashyad
Fowler, Douglas M
Dunham, Maitreya J
Roth, Frederick P
Source :: Bioinformatics; Apr2024, Vol. 40 Issue 4, p1-3, 3p
Publication Year :: 2024
Abstract:

Motivation: Long-read sequencing technologies, an attractive solution for many applications, often suffer from higher error rates. Alignment of multiple reads can improve base-calling accuracy, but some applications, e.g. sequencing mutagenized libraries where multiple distinct clones differ by one or few variants, require the use of barcodes or unique molecular identifiers. Unfortunately, sequencing errors can interfere with correct barcode identification, and a given barcode sequence may be linked to multiple independent clones within a given library.

Results: Here we focus on the target application of sequencing mutagenized libraries in the context of multiplexed assays of variant effects (MAVEs). MAVEs are increasingly used to create comprehensive genotype-phenotype maps that can aid clinical variant interpretation. Many MAVE methods use long-read sequencing of barcoded mutant libraries for accurate association of barcode with genotype. Existing long-read sequencing pipelines do not account for inaccurate sequencing or nonunique barcodes. Here, we describe Pacybara, which handles these issues by clustering long reads based on the similarities of (error-prone) barcodes while also detecting barcodes that have been associated with multiple genotypes. Pacybara also detects recombinant (chimeric) clones and reduces false positive indel calls. In three example applications, we show that Pacybara identifies and correctly resolves these issues.

Availability and implementation: Pacybara, freely available at https://github.com/rothlab/pacybara, is implemented using R, Python, and bash for Linux. It runs on GNU/Linux HPC clusters via Slurm, PBS, or GridEngine schedulers. A single-machine simplex version is also available.

[ABSTRACT FROM AUTHOR]
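
Illustrative note: the abstract describes clustering reads by the similarity of error-prone barcodes and flagging barcodes linked to more than one genotype. The Python sketch below shows that general idea only; it is not Pacybara's implementation. The function names (`hamming`, `cluster_reads`), the Hamming-distance criterion, the `max_dist` and `min_support` thresholds, and the toy reads are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def hamming(a, b):
    """Count mismatching positions; unequal lengths are treated as maximally distant."""
    if len(a) != len(b):
        return max(len(a), len(b))
    return sum(x != y for x, y in zip(a, b))


def cluster_reads(reads, max_dist=1, min_support=2):
    """Group (barcode, genotype) reads by barcode similarity.

    Hypothetical simplification: abundant barcodes seed clusters, and rarer
    barcodes within `max_dist` mismatches of a seed are merged into it.
    A cluster is flagged as a collision when more than one genotype has at
    least `min_support` reads.
    """
    counts = Counter(bc for bc, _ in reads)
    seeds = []
    assignment = {}
    # Visit barcodes from most to least abundant (ties broken alphabetically).
    for bc, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        match = next((s for s in seeds if hamming(bc, s) <= max_dist), None)
        if match is None:
            seeds.append(bc)
            match = bc
        assignment[bc] = match

    clusters = defaultdict(Counter)
    for bc, genotype in reads:
        clusters[assignment[bc]][genotype] += 1

    result = {}
    for seed, genotypes in clusters.items():
        supported = [g for g, n in genotypes.items() if n >= min_support]
        result[seed] = {"genotypes": genotypes, "collision": len(supported) > 1}
    return result


# Toy data (invented): one barcode with a sequencing error, one reused barcode.
reads = (
    [("ACGT", "A12T")] * 3
    + [("ACGA", "A12T")]
    + [("TTTT", "G5C")] * 2
    + [("TTTT", "L9P")] * 2
)
res = cluster_reads(reads)
```

In this toy input, the single `ACGA` read is merged into the `ACGT` cluster, while `TTTT` is flagged because two genotypes are each supported by two reads.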

Subjects :: MOLECULAR cloning
ERROR rates
BAR codes
LIBRARIES
LIBRARY associations

Details

Language :: English
ISSN :: 13674803
Volume :: 40
Issue :: 4
Database :: Complementary Index
Journal :: Bioinformatics
Publication Type :: Academic Journal
Accession number :: 176933448
Full Text :: https://doi.org/10.1093/bioinformatics/btae182