1. Pairs and Pairix: a file format and a tool for efficient storage and retrieval for Hi-C read pairs.
- Author
- Lee, Soohyun; Bakker, Clara R; Vitzthum, Carl; Alver, Burak H; Park, Peter J
- Subjects
- *TEXT files, *SOURCE code, *STORAGE, *PYTHON programming language
- Abstract
- Summary: As the amount of 3D chromosomal interaction data continues to increase, storing and accessing such data efficiently becomes paramount. We introduce Pairs, a block-compressed text file format for storing paired genomic coordinates from Hi-C data, and Pairix, an open-source C application to index and query Pairs files. Pairix (also available in Python and R) extends the functionalities of Tabix to paired coordinates data. We have also developed PairsQC, a collapsible HTML quality control report generator for Pairs files. Availability and implementation: The format specification and source code are available at https://github.com/4dn-dcic/pairix, https://github.com/4dn-dcic/Rpairix and https://github.com/4dn-dcic/pairsqc. [ABSTRACT FROM AUTHOR]
- Published
- 2022