Accurate and Efficient Estimation of Small P-values with the Cross-Entropy Method: Applications in Genomic Data Analysis
- Source :
- Bioinformatics
- Publication Year :
- 2018
- Publisher :
- arXiv, 2018.
Abstract
- Small $p$-values are often required to be accurately estimated in large scale genomic studies for the adjustment of multiple hypothesis tests and the ranking of genomic features based on their statistical significance. For those complicated test statistics whose cumulative distribution functions are analytically intractable, existing methods usually do not work well with small $p$-values due to lack of accuracy or computational restrictions. We propose a general approach for accurately and efficiently calculating small $p$-values for a broad range of complicated test statistics based on the principle of the cross-entropy method and Markov chain Monte Carlo sampling techniques. We evaluate the performance of the proposed algorithm through simulations and demonstrate its application to three real examples in genomic studies. The results show that our approach can accurately evaluate small to extremely small $p$-values (e.g. $10^{-6}$ to $10^{-100}$). The proposed algorithm is helpful to the improvement of existing test procedures and the development of new test procedures in genomic studies.
- Comment: 34 pages, 1 figure, 4 tables
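
The cross-entropy principle named in the abstract can be illustrated on a toy problem. The sketch below is not the authors' algorithm: it assumes a one-dimensional standard normal test statistic (so the exact tail probability is available for comparison), uses a simple mean-shifted normal proposal family, and draws independent samples rather than using Markov chain Monte Carlo. The function name `ce_tail_probability` and all of its parameters are hypothetical choices made for this illustration.

```python
import math
import random


def ce_tail_probability(q, n_iter_samples=2000, n_final_samples=10000,
                        rho=0.1, max_iter=50, seed=0):
    """Estimate P(X >= q) for X ~ N(0, 1) with a cross-entropy mean shift.

    Toy illustration only: the proposal family is N(mu, 1), and mu is
    tuned by the standard cross-entropy update before a final
    importance-sampling pass.
    """
    rng = random.Random(seed)
    mu = 0.0  # start from the null distribution itself

    for _ in range(max_iter):
        xs = [rng.gauss(mu, 1.0) for _ in range(n_iter_samples)]
        # Intermediate level: the (1 - rho) sample quantile, capped at q.
        quantile = sorted(xs)[int((1.0 - rho) * n_iter_samples)]
        level = min(q, quantile)
        elite = [x for x in xs if x >= level]
        # Likelihood ratio of N(0, 1) to N(mu, 1) evaluated at x.
        weights = [math.exp(-mu * x + 0.5 * mu * mu) for x in elite]
        # Cross-entropy update for a normal mean: weighted mean of elites.
        mu = sum(w * x for w, x in zip(weights, elite)) / sum(weights)
        if level >= q:
            break

    # Final importance-sampling estimate under the tuned proposal.
    total = 0.0
    for _ in range(n_final_samples):
        x = rng.gauss(mu, 1.0)
        if x >= q:
            total += math.exp(-mu * x + 0.5 * mu * mu)
    return total / n_final_samples, mu


if __name__ == "__main__":
    p_hat, mu = ce_tail_probability(6.0)
    exact = 0.5 * math.erfc(6.0 / math.sqrt(2.0))
    print(f"tuned proposal mean: {mu:.3f}")
    print(f"estimate: {p_hat:.3e}  exact: {exact:.3e}")
```

Plain Monte Carlo would need on the order of billions of draws to observe even one exceedance at this threshold, whereas the tuned proposal places most of its mass in the tail. For the intractable statistics the abstract targets, no closed-form check of this kind would be available.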
- Subjects :
- Statistics and Probability
Data Analysis
FOS: Computer and information sciences
Computer science
Entropy
computer.software_genre
Biochemistry
Statistics - Applications
03 medical and health sciences
Statistical significance
Range (statistics)
Entropy (information theory)
Applications (stat.AP)
Molecular Biology
030304 developmental biology
Statistical hypothesis testing
Estimation
0303 health sciences
Genome
Markov chain
Cumulative distribution function
030302 biochemistry & molecular biology
Cross-entropy method
Genomics
Original Papers
Markov Chains
Computer Science Applications
Computational Mathematics
Computational Theory and Mathematics
Ranking
Data mining
computer
Algorithms
Details
- Database :
- OpenAIRE
- Journal :
- Bioinformatics
- Accession number :
- edsair.doi.dedup.....c7807fb2977ab358b25f81d6cb6c02d6
- Full Text :
- https://doi.org/10.48550/arxiv.1803.03373