
A model selection approach for multiple sequence segmentation and dimensionality reduction.

Authors :
Castro, Bruno M.
Lemes, Renan B.
Cesar, Jonatas
Hünemeier, Tábita
Leonardi, Florencia
Source :
Journal of Multivariate Analysis. Sep 2018, Vol. 167, p319-330. 12p.
Publication Year :
2018

Abstract

In this paper we consider the problem of segmenting n aligned random sequences of equal length m into a finite number of independent blocks. We propose a penalized maximum likelihood criterion to infer simultaneously the number of points of independence as well as the position of each point. We show how to compute exactly the estimator by means of a dynamic programming algorithm with time complexity O(m^2 n). We also propose another method, called hierarchical algorithm, that provides an approximation to the estimator when the sample size increases and runs in time O(m ln(m) n). Our main theoretical results are the strong consistency of both estimators when the sample size n grows to infinity. We illustrate the convergence of these algorithms through some simulation examples and we apply the method to identify recombination hotspots in real SNPs data. [ABSTRACT FROM AUTHOR]
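
The dynamic programming idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `block_cost` and `segment`, the penalty `lam * (number of distinct observed words - 1)`, and the constant `lam` are all assumptions chosen for illustration, and the sketch makes no attempt to reach the O(m^2 n) running time stated in the abstract. It only shows how a penalized likelihood that is additive over blocks can be minimized exactly by a recursion over candidate cut points.

```python
from collections import Counter
from math import log


def block_cost(rows, i, j, lam):
    """Penalized negative log-likelihood of columns i..j-1 taken as one block.

    The block likelihood uses the empirical distribution of the n observed
    words. The penalty below is a placeholder (an assumption), not the
    penalty defined in the paper.
    """
    n = len(rows)
    counts = Counter(tuple(r[i:j]) for r in rows)
    nll = -sum(c * log(c / n) for c in counts.values())
    return nll + lam * (len(counts) - 1)


def segment(rows, lam):
    """Return (cut points, total cost) minimizing the summed block costs."""
    m = len(rows[0])
    best = [0.0] + [float("inf")] * m  # best[j]: optimal cost of columns 0..j-1
    prev = [0] * (m + 1)               # prev[j]: start of the last block ending at j
    for j in range(1, m + 1):
        for i in range(j):
            c = best[i] + block_cost(rows, i, j, lam)
            if c < best[j]:
                best[j] = c
                prev[j] = i
    # Walk back through the stored block starts to recover the cut points.
    cuts = []
    j = m
    while j > 0:
        j = prev[j]
        if j > 0:
            cuts.append(j)
    return sorted(cuts), best[m]


# Toy data: columns 0-1 always agree, columns 2-3 always agree,
# and the two pairs vary independently of each other.
rows = [(0, 0, 0, 0), (0, 0, 1, 1), (1, 1, 0, 0), (1, 1, 1, 1)] * 5
cuts, cost = segment(rows, lam=1.0)
print(cuts, cost)
```

On this toy input the sketch places a single cut between the second and third columns, since the two column pairs are independent of each other while the columns within each pair are not.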

Details

Language :
English
ISSN :
0047-259X
Volume :
167
Database :
Academic Search Index
Journal :
Journal of Multivariate Analysis
Publication Type :
Academic Journal
Accession number :
130839186
Full Text :
https://doi.org/10.1016/j.jmva.2018.05.006