1. TKSM: highly modular, user-customizable, and scalable transcriptomic sequencing long-read simulator.
- Author
- Karaoğlanoğlu F, Orabi B, Flannigan R, Chauve C, and Hach F
- Subjects
- Computer Simulation, RNA, Gene Expression Profiling, High-Throughput Nucleotide Sequencing methods, Software
- Abstract
Motivation: Transcriptomic long-read (LR) sequencing is an increasingly cost-effective technology for probing various RNA features. Numerous tools have been developed to tackle various transcriptomic sequencing tasks (e.g. isoform and gene fusion detection). However, the lack of abundant gold-standard datasets hinders the benchmarking of such tools. Therefore, the simulation of LR sequencing is an important and practical alternative. While the existing LR simulators aim to imitate the sequencing machine noise and to target specific library protocols, they lack some important library preparation steps (e.g. PCR) and are difficult to adapt to new and changing library preparation techniques (e.g. single-cell LRs). Results: We present TKSM, a modular and scalable LR simulator, designed so that each RNA modification step is targeted explicitly by a specific module. This allows the user to assemble a simulation pipeline as a combination of TKSM modules to emulate a specific sequencing design. Additionally, the input/output of all the core modules of TKSM follows the same simple format (Molecule Description Format), allowing the user to easily extend TKSM with new modules targeting new library preparation steps. Availability and Implementation: TKSM is available as open-source software at https://github.com/vpc-ccg/tksm. (© The Author(s) 2024. Published by Oxford University Press.)
- Published
- 2024