
Differential analysis in Transcriptomic: The strength of randomly picking 'reference' genes

Authors :
Dorota Desaulle
Céline Hoffmann
Bernard Hainque
Yves Rozenholc
Pascal Bigey
Laboratoire de biomathématiques, EA 7537 [Paris] (BioSTM)
Université de Paris - UFR Pharmacie [Santé] (UP UFR Pharmacie)
Université de Paris (UP)-Université de Paris (UP)
Unité de Technologies Chimiques et Biologiques pour la Santé (UTCBS - UM 4 (UMR 8258 / U1022))
Institut de Chimie du CNRS (INC)-Université de Paris (UP)-Centre National de la Recherche Scientifique (CNRS)-Institut National de la Santé et de la Recherche Médicale (INSERM)
Université Paris Cité - UFR Pharmacie [Santé] (UPCité UFR Pharmacie)
Université Paris Cité (UPCité)-Université Paris Cité (UPCité)
Institut National de la Santé et de la Recherche Médicale (INSERM)-Institut de Chimie du CNRS (INC)-Centre National de la Recherche Scientifique (CNRS)-Université Paris Cité (UPCité)
Source :
HAL

Abstract

Transcriptomic analyses are not directly quantitative: they provide only relative measurements of expression levels, up to an unknown individual scaling factor. This difficulty is compounded in differential expression analysis. Several methods have been proposed to circumvent this lack of knowledge by estimating the unknown individual scaling factors; however, even the most widely used ones rest on biological hypotheses that are hard to justify or have a weak statistical foundation. Only two methods withstand this scrutiny: one based on the largest connected graph component, which is hardly usable for large numbers of expressions as in NGS, and a second based on $\log$-linear fits, which unfortunately requires a first step that uses one of the methods described before. We introduce a new procedure for differential analysis in the context of transcriptomic data. It results from pooling several differential analyses, each based on randomly picked genes used as reference genes. It provides a differential analysis free from the estimation of the individual scaling factors or any other knowledge. Theoretical properties are investigated in terms of both FWER and power. Moreover, in the context of Poisson or negative binomial modeling of the transcriptomic expressions, we derive a test with non-asymptotic control of its bounds. We complete our study with empirical simulations and apply our procedure to a real data set of hepatic miRNA expressions from a mouse model of non-alcoholic steatohepatitis (NASH), the CDAHFD model. This study on real data provides new hits with good biological explanations.

30 pages, 2 figures
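
The idea of pooling analyses over randomly picked reference genes can be illustrated with a toy sketch. This is not the authors' procedure: the abstract does not specify the test statistic or the pooling rule, so the Welch t statistic on log-ratios, the median as pooling rule, and the Gaussian log-noise simulation (instead of a Poisson or negative binomial model) are all assumptions made for illustration, and the names `welch_t`, `pooled_scores` and `simulate` are hypothetical. The sketch gives no FWER control. It only shows that taking the log-ratio of a gene to a reference gene cancels the unknown per-sample scaling factor, and that pooling over several random references limits the influence of a badly chosen one.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Welch two-sample t statistic for two lists of numbers."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    diff = statistics.mean(a) - statistics.mean(b)
    if se == 0.0:
        return 0.0 if diff == 0.0 else math.copysign(math.inf, diff)
    return diff / se


def pooled_scores(expr, group_a, group_b, n_refs, rng):
    """Score each gene by the median |t| over randomly picked reference genes.

    expr[g][i] is the (strictly positive) expression of gene g in sample i.
    The median pooling rule is an assumption of this sketch.
    """
    n_genes = len(expr)
    log_expr = [[math.log(x) for x in row] for row in expr]
    scores = []
    for g in range(n_genes):
        candidates = [r for r in range(n_genes) if r != g]
        refs = rng.sample(candidates, n_refs)
        stats = []
        for r in refs:
            # Log-ratio to the reference: the per-sample scaling factor cancels.
            ratio = [lg - lr for lg, lr in zip(log_expr[g], log_expr[r])]
            a = [ratio[i] for i in group_a]
            b = [ratio[i] for i in group_b]
            stats.append(abs(welch_t(a, b)))
        scores.append(statistics.median(stats))
    return scores


def simulate(rng, n_genes=20, n_per_group=5, shift=3.0, noise=0.1):
    """Toy data: gene 0 is shifted in group B; each sample has its own scale."""
    n = 2 * n_per_group
    scale = [math.exp(rng.uniform(-1.0, 1.0)) for _ in range(n)]
    expr = []
    for g in range(n_genes):
        base = rng.uniform(2.0, 6.0)
        row = []
        for i in range(n):
            effect = shift if (g == 0 and i >= n_per_group) else 0.0
            row.append(scale[i] * math.exp(base + effect + rng.gauss(0.0, noise)))
        expr.append(row)
    return expr, list(range(n_per_group)), list(range(n_per_group, n))


expr, group_a, group_b = simulate(random.Random(0))
scores = pooled_scores(expr, group_a, group_b, n_refs=9, rng=random.Random(1))
top_gene = max(range(len(scores)), key=scores.__getitem__)
print("highest pooled score: gene", top_gene)
```

In this toy setting the scaling factors are never estimated. A single differentially expressed reference would distort the scores of every gene compared against it, which is why the sketch pools over several references with a robust summary rather than relying on one.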

Details

Database :
OpenAIRE
Journal :
HAL
Accession number :
edsair.doi.dedup.....89329aafba703366def3f20a9e67683b