
MACARON: a python framework to identify and re-annotate multi-base affected codons in whole genome/exome sequence data

Authors :
Waqasuddin Khan
Tania Cuppens
Thomas Ludwig
David-Alexandre Trégouët
Emmanuelle Génin
Florian Thibord
Jean-François Deleuze
Ganapathi Varma Saripella
Unité de Recherche sur les Maladies Cardiovasculaires, du Métabolisme et de la Nutrition = Research Unit on Cardiovascular and Metabolic Diseases (ICAN)
Université Pierre et Marie Curie - Paris 6 (UPMC)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Institut National de la Santé et de la Recherche Médicale (INSERM)-CHU Pitié-Salpêtrière [AP-HP]
Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)-Sorbonne Université (SU)
Génétique, génomique fonctionnelle et biotechnologies (UMR 1078) (GGB)
EFS-Université de Brest (UBO)-Institut National de la Santé et de la Recherche Médicale (INSERM)-Institut Brestois Santé Agro Matière (IBSAM)
Université de Brest (UBO)
Centre National de Génotypage (CNG)
Commissariat à l'énergie atomique et aux énergies alternatives (CEA)
Centre National de Recherche en Génomique Humaine (CNRGH)
FREX Consortium
GENMED Consortium
ANR-10-LABX-0013,GENMED,Medical Genomics(2010)
ANR-10-INBS-0009,France-Génomique,Organisation et montée en puissance d'une Infrastructure Nationale de Génomique(2010)
Unité de Recherche sur les Maladies Cardiovasculaires, du Métabolisme et de la Nutrition = Institute of cardiometabolism and nutrition (ICAN)
Sorbonne Université (SU)-Assistance publique - Hôpitaux de Paris (AP-HP) (AP-HP)-Sorbonne Université (SU)
Institut Brestois Santé Agro Matière (IBSAM)
Université de Brest (UBO)-Université de Brest (UBO)-EFS-Institut National de la Santé et de la Recherche Médicale (INSERM)
Source :
Bioinformatics, Oxford University Press (OUP), 2018, ⟨10.1093/bioinformatics/bty382⟩
Publication Year :
2018
Publisher :
HAL CCSD, 2018.

Abstract

Summary: Predicted deleteriousness of coding variants is a frequently used criterion to filter out variants detected in next-generation sequencing projects and to select candidates that affect the risk of human diseases. Most available dedicated tools implement a base-by-base annotation approach that can be biased in the presence of several variants in the same genetic codon. Here we propose the MACARON program that, from a standard VCF file, identifies, re-annotates and predicts the amino acid change resulting from multiple single nucleotide variants (SNVs) within the same genetic codon. Applied to the whole-exome dataset of 573 individuals, MACARON identifies 114 situations where multiple SNVs within a genetic codon induce an amino acid change that differs from the one predicted by standard single-SNV annotation tools. Such events are not uncommon and deserve to be studied in sequencing projects with inconclusive findings.

Availability and implementation: MACARON is written in Python, with code available on the GENMED website (www.genmed.fr).

Supplementary information: Supplementary data are available at Bioinformatics online.
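The codon-level effect described in the abstract can be illustrated with a minimal Python sketch. This is not MACARON's code: the function name `annotate_codon`, its input format (a reference codon plus a mapping of 0-based codon positions to alternate bases) and the example codon are hypothetical, and the sketch assumes that all SNVs lie on the same haplotype and that the codon is given on the coding strand. It only contrasts the amino acid predicted for each SNV taken alone with the one predicted when the SNVs are applied together.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def annotate_codon(ref_codon, snvs):
    """Compare per-SNV and combined amino acid predictions for one codon.

    ref_codon: three-letter reference codon on the coding strand.
    snvs: dict mapping a 0-based position in the codon to the alternate base.
          Assumption: all SNVs are carried by the same haplotype.
    Returns (reference amino acid, {position: amino acid for that SNV alone},
             amino acid when all SNVs are applied together).
    """
    ref_aa = CODON_TABLE[ref_codon]

    # Base-by-base annotation: each SNV is applied to the reference alone.
    single = {}
    for pos, alt in snvs.items():
        mutated = ref_codon[:pos] + alt + ref_codon[pos + 1:]
        single[pos] = CODON_TABLE[mutated]

    # Codon-level annotation: all SNVs are applied at once.
    combined = list(ref_codon)
    for pos, alt in snvs.items():
        combined[pos] = alt
    combined_aa = CODON_TABLE["".join(combined)]

    return ref_aa, single, combined_aa


# Hypothetical example: CAG (Gln) with C>T at position 0 and A>G at position 1.
ref_aa, single, combined_aa = annotate_codon("CAG", {0: "T", 1: "G"})
print(ref_aa, single, combined_aa)  # Q {0: '*', 1: 'R'} W
```

In this made-up example, annotating each SNV separately predicts a stop gain (TAG) and a Gln-to-Arg change (CGG), whereas the two SNVs together yield TGG, a Gln-to-Trp change that neither single-SNV annotation reports.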

Details

Language :
English
ISSN :
1367-4803 and 1367-4811
Database :
OpenAIRE
Journal :
Bioinformatics, Oxford University Press (OUP), 2018, ⟨10.1093/bioinformatics/bty382⟩
Accession number :
edsair.doi.dedup.....f172c12b8549e3dd00b57c91435c38c7
Full Text :
https://doi.org/10.1093/bioinformatics/bty382