1. kruX: matrix-based non-parametric eQTL discovery
- Author
- Hassan Foroughi Asl, Johan Björkegren, Jianlong Qi, and Tom Michoel
- Subjects
Genotype; Computer science; Quantitative Trait Loci; eQTL; Polymorphism, Single Nucleotide; Matrix algebra; Biochemistry; Statistics, Nonparametric; Structural Biology; Genetic model; Test statistic; Humans; Non-parametric methods; Quantitative Biology - Genomics; Molecular Biology; Statistical hypothesis testing; Parametric statistics; Genomics (q-bio.GN); Genome; Applied Mathematics; Nonparametric statistics; Linear model; Computational Biology; Reproducibility of Results; Pattern recognition; Computer Science Applications; FOS: Biological sciences; Expression quantitative trait loci; Outlier; Artificial intelligence; Algorithms; Software
- Abstract
The Kruskal-Wallis test is a popular non-parametric statistical test for identifying expression quantitative trait loci (eQTLs) from genome-wide data due to its robustness against variations in the underlying genetic model and expression trait distribution, but testing billions of marker-trait combinations one-by-one can become computationally prohibitive. We developed kruX, an algorithm implemented in Matlab, Python and R that uses matrix multiplications to simultaneously calculate the Kruskal-Wallis test statistic for several millions of marker-trait combinations at once. KruX is more than ten thousand times faster than computing associations one-by-one on a typical human dataset. We used kruX and a dataset of more than 500k SNPs and 20k expression traits measured in 102 human blood samples to compare eQTLs detected by the Kruskal-Wallis test to eQTLs detected by the parametric ANOVA and linear model methods. We found that the Kruskal-Wallis test is more robust against data outliers and heterogeneous genotype group sizes and detects a higher proportion of non-linear associations, but is more conservative for calling additive linear associations. In summary, kruX enables the use of robust non-parametric methods for massive eQTL mapping without the need for a high-performance computing infrastructure.
- Comments
- minor revision; 6 pages, 5 figures; software available at http://krux.googlecode.com
- Published
- 2014
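The abstract's central idea, computing the Kruskal-Wallis statistic for many marker-trait pairs at once with matrix multiplications, can be illustrated with a minimal NumPy sketch. This is not the kruX code itself: the function name `kruskal_wallis_matrix` and its arguments are hypothetical, and the sketch assumes no missing values, no tied expression values (so no tie correction is applied), and genotypes coded as integers 0, 1, ..., `n_groups - 1`. For each genotype group, one product of a marker-by-sample indicator matrix with the sample-by-trait rank matrix yields the rank sums for every pair simultaneously.

```python
import numpy as np


def kruskal_wallis_matrix(expr, geno, n_groups=3):
    """Kruskal-Wallis H for every marker-trait pair (illustrative sketch).

    expr: (n_traits, n_samples) expression values, assumed free of ties and NaNs.
    geno: (n_markers, n_samples) integer genotype codes in 0..n_groups-1.
    Returns an (n_markers, n_traits) array of H statistics.
    """
    expr = np.asarray(expr, dtype=float)
    geno = np.asarray(geno)
    n_samples = expr.shape[1]

    # Rank each trait across samples (1..N). The double argsort is only
    # valid under the no-ties assumption stated above.
    ranks = expr.argsort(axis=1).argsort(axis=1) + 1.0

    # Accumulate sum_g R_g^2 / n_g, where R_g is the rank sum in group g.
    scaled_sq_sums = np.zeros((geno.shape[0], expr.shape[0]))
    for g in range(n_groups):
        indicator = (geno == g).astype(float)   # (n_markers, n_samples)
        group_size = indicator.sum(axis=1)      # (n_markers,)
        rank_sums = indicator @ ranks.T         # (n_markers, n_traits)
        nonempty = group_size > 0               # skip groups with no samples
        scaled_sq_sums[nonempty] += (
            rank_sums[nonempty] ** 2 / group_size[nonempty, None]
        )

    return (12.0 / (n_samples * (n_samples + 1)) * scaled_sq_sums
            - 3.0 * (n_samples + 1))


# Usage on small random data: 4 traits, 5 markers, 20 samples.
rng = np.random.default_rng(0)
expr = rng.normal(size=(4, 20))
geno = rng.integers(0, 3, size=(5, 20))
H = kruskal_wallis_matrix(expr, geno)
print(H.shape)  # (5, 4)
```

In this sketch the only loop runs over genotype groups, so the cost is dominated by a handful of dense matrix products instead of one test per pair. Converting H to p-values would typically use the chi-squared approximation with one fewer degree of freedom than the number of non-empty genotype groups for each marker; that step, along with tie and missing-data handling, is left out here.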