Band-based similarity indices for gene expression classification and clustering

Authors :
Aurora Torrente
Comunidad de Madrid
Ministerio de Ciencia, Innovación y Universidades (España)
Source :
Scientific Reports, Vol 11, Iss 1, Pp 1-18 (2021); e-Archivo. Repositorio Institucional de la Universidad Carlos III de Madrid
Publication Year :
2021
Publisher :
Nature Portfolio, 2021.

Abstract

The concept of depth induces a centre-outward ordering in multivariate data. Most depth definitions are computationally infeasible for dimensions larger than three or four, but the Modified Band Depth (MBD) is a notable exception that has proven to be a valuable tool in the analysis of high-dimensional gene expression data. This depth definition relates the centrality of each individual to its (partial) inclusion in all possible bands formed by elements of the data set. We assess (dis)similarity between pairs of observations by accounting for such bands and constructing binary matrices associated with each pair. From these matrices, contingency tables are calculated and used to derive standard similarity indices. Our approach is computationally efficient and can be applied to bands formed by any number of observations from the data set. We have evaluated the performance of several band-based similarity indices against that of other classical distances in standard classification and clustering tasks on a variety of simulated and real data sets. The method is not restricted to these indices, however; extending it to other similarity coefficients is straightforward. Our experiments show the benefits of our technique, with some of the selected indices outperforming, among others, the Euclidean distance.

This work has been financially supported by the FEDER/Ministerio de Ciencia, Innovación y Universidades - Agencia Estatal de Investigación, Grant Numbers FIS2017-84440-C2-2-P and MTM2017-84446-C2-2-R, and by the Madrid Government (Comunidad de Madrid-Spain) under the Multiannual Agreement with UC3M in the line of Excellence of University Professors (EPUC3M23), and in the context of the V PRICIT (Regional Programme of Research and Technological Innovation).

Published.
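The band construction summarised in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not confirm: bands are delimited by pairs of observations (J = 2), the binary matrix of an observation records, for every band and coordinate, whether that coordinate lies inside the band, and the Jaccard coefficient stands in for "a standard similarity index". The function names `band_inclusion`, `modified_band_depth` and `jaccard_band_similarity` are hypothetical, and the article's actual construction may differ.

```python
from itertools import combinations

import numpy as np


def band_inclusion(data):
    """Return a boolean array of shape (n_obs, n_bands, n_dims).

    Entry [i, b, k] is True when coordinate k of observation i lies inside
    band b. Assumption: each band is delimited by a pair of observations.
    """
    data = np.asarray(data, dtype=float)
    pairs = list(combinations(range(len(data)), 2))
    lower = np.array([np.minimum(data[p], data[q]) for p, q in pairs])
    upper = np.array([np.maximum(data[p], data[q]) for p, q in pairs])
    # Broadcast observations (n, 1, d) against band limits (1, m, d).
    x = data[:, None, :]
    return (lower[None] <= x) & (x <= upper[None])


def modified_band_depth(data):
    """MBD with J = 2: mean proportion of coordinates inside each band."""
    return band_inclusion(data).mean(axis=(1, 2))


def jaccard_band_similarity(data, i, j):
    """Jaccard index computed from the 2x2 contingency counts of the
    flattened inclusion matrices of observations i and j (illustrative)."""
    inside = band_inclusion(data)
    u, v = inside[i].ravel(), inside[j].ravel()
    a = np.sum(u & v)    # inside the band for both observations
    b = np.sum(u & ~v)   # inside for i only
    c = np.sum(~u & v)   # inside for j only
    total = a + b + c
    return a / total if total else 1.0
```

For three points on a line, such as `[[0, 0], [1, 1], [2, 2]]`, the middle point is inside every band and therefore receives the largest depth, and it is more similar to either end point than the end points are to each other. Other coefficients can be obtained from the same counts `a`, `b`, `c` (and the count of joint exclusions) without changing the band computation.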

Details

Language :
English
ISSN :
2045-2322
Volume :
11
Issue :
1
Database :
OpenAIRE
Journal :
Scientific Reports
Accession number :
edsair.doi.dedup.....67d34cebd017ad660fc0b3d88b91465e