
Spiked proteomic standard dataset for testing label-free quantitative software and statistical methods

Authors :
Marlène Marcellin
David Bouyssié
Jérôme Garin
Christine Carapito
Odile Burlet-Schiltz
Anne Gonzalez de Peredo
Anne-Marie Hesse
Karima Chaoui
Agnès Hovasse
Alain Van Dorssaeler
Christophe Bruley
Emmanuelle Mouton-Barbosa
Yohann Couté
Myriam Ferro
Christine Schaeffer
Sarah Cianférani
Claire Ramus
Sebastian Vaca
Institut de pharmacologie et de biologie structurale (IPBS)
Centre National de la Recherche Scientifique (CNRS)-Université Toulouse III - Paul Sabatier (UT3)
Université Fédérale Toulouse Midi-Pyrénées
Laboratoire de Spectrométrie de Masse BioOrganique [Strasbourg] (LSMBO)
Département Sciences Analytiques et Interactions Ioniques et Biomoléculaires (DSA-IPHC)
Institut Pluridisciplinaire Hubert Curien (IPHC)
Université de Strasbourg (UNISTRA)-Institut National de Physique Nucléaire et de Physique des Particules du CNRS (IN2P3)-Centre National de la Recherche Scientifique (CNRS)
Source :
Data in Brief, Elsevier, 2016, Vol 6, pp. 286-294. doi:10.1016/j.dib.2015.11.063
Publication Year :
2015

Abstract

This data article describes a controlled, spiked proteomic dataset for which the “ground truth” of variant proteins is known. It is based on the LC-MS analysis of samples composed of a fixed background of yeast lysate and different spiked amounts of the UPS1 mixture of 48 recombinant proteins. It can be used to objectively evaluate bioinformatic pipelines for label-free quantitative analysis, and their ability to detect variant proteins with good sensitivity and a low false discovery rate in large-scale proteomic studies. More specifically, it can be useful for tuning software tool parameters, for testing new algorithms for label-free quantitative analysis, and for evaluating downstream statistical methods. The raw MS files can be downloaded from ProteomeXchange under identifier PXD001819 (http://www.ebi.ac.uk/pride/archive/projects/PXD001819). Starting from some raw files of this dataset, we also provide processed data obtained with various bioinformatics tools (including MaxQuant, Skyline, MFPaQ, IRMa-hEIDI and Scaffold) in different workflows, to exemplify the use of such data for software benchmarking, as discussed in detail in the accompanying manuscript [1]. The experimental design used for data processing takes advantage of the different spike levels introduced in the samples composing the dataset. The processed data are merged into a single file to facilitate the evaluation and illustration of software tool results for the detection of variant proteins with different absolute expression levels and fold change values.
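
As a minimal sketch of how a known ground truth supports this kind of evaluation: if the spiked proteins are treated as true variants and the yeast background as invariant, a list of proteins reported as differential can be scored for sensitivity and for its false discovery proportion (the observed counterpart of the false discovery rate). The function name and the protein identifiers below are hypothetical and chosen for illustration only; they are not taken from the dataset or from any of the tools named above.

```python
def score_calls(called, spiked, background):
    """Score proteins called differential against a known ground truth.

    called     -- identifiers reported as variant by some pipeline
    spiked     -- identifiers of the spiked proteins (true variants)
    background -- identifiers of the background proteins (true non-variants)
    """
    called = set(called)
    true_pos = len(called & spiked)       # spiked proteins correctly called
    false_pos = len(called & background)  # background proteins wrongly called
    total_calls = true_pos + false_pos
    return {
        "true_positives": true_pos,
        "false_positives": false_pos,
        # Fraction of the spiked proteins that were detected
        "sensitivity": true_pos / len(spiked) if spiked else 0.0,
        # Fraction of the calls that are false discoveries
        "fdp": false_pos / total_calls if total_calls else 0.0,
    }


# Hypothetical identifiers, for illustration only
spiked = {"UPS_A", "UPS_B", "UPS_C", "UPS_D"}
background = {"YEAST_1", "YEAST_2", "YEAST_3"}
called = ["UPS_A", "UPS_B", "UPS_C", "YEAST_1"]

result = score_calls(called, spiked, background)
print(result)
```

In this toy case three of the four spiked proteins are recovered and one of the four calls is a background protein, so the sensitivity is 0.75 and the false discovery proportion is 0.25.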

Details

ISSN :
23523409
Volume :
6
Database :
OpenAIRE
Journal :
Data in brief
Accession number :
edsair.doi.dedup.....e5401e07d0c8b69889bf871c8b94567d
Full Text :
https://doi.org/10.1016/j.dib.2015.11.063