
Unsupervised Learning Informational Limit in Case of Sparsely Described Examples.

Authors :
Bock, H.-H.
Gaul, W.
Vichi, M.
Arabie, Ph.
Baier, D.
Critchley, F.
Decker, R.
Diday, E.
Greenacre, M.
Lauro, C.
Meulman, J.
Monari, P.
Nishisato, S.
Ohsumi, N.
Opitz, O.
Ritter, G.
Schader, M.
Weihs, C.
Brito, Paula
Cucumel, Guy
Source :
Selected Contributions in Data Analysis & Classification; 2007, p345-355, 11p
Publication Year :
2007

Abstract

This paper presents a model characterizing unsupervised learning from an information-theoretic point of view. Under some hypotheses, it defines a theoretical quality criterion, which corresponds to the informational limit that bounds the learning ability of any clustering algorithm. This quality criterion depends on the information content of the learning set. It is relevant when examples are sparsely described, i.e., when most of the descriptors are missing. This theoretical limit of any unsupervised learning algorithm is then compared to the actual learning quality of different clustering algorithms (EM, COBWEB and PRESS). The empirical comparison uses artificial data sets that are randomly degraded. Finally, the paper shows that the results of PRESS, an algorithm specifically designed to learn from sparsely described examples, are very close to the theoretical upper-bound quality. [ABSTRACT FROM AUTHOR]
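The degradation protocol the abstract describes (artificial clustered data, random removal of descriptor values, clustering quality measured against the known partition) can be illustrated in a few lines. The sketch below is not the authors' code: it stands in EM via scikit-learn's GaussianMixture, fills the masked entries by mean imputation, and scores recovery with the adjusted Rand index. The masking rates, data shape, and imputation strategy are all assumptions for illustration, and PRESS itself is not reproduced here.

import numpy as np
from sklearn.datasets import make_blobs
from sklearn.impute import SimpleImputer
from sklearn.mixture import GaussianMixture
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(0)

# Artificial learning set: 3 true clusters over 10 descriptors (assumed shape).
X, y_true = make_blobs(n_samples=300, centers=3, n_features=10, random_state=0)

for rate in (0.0, 0.5, 0.9):  # fraction of descriptor values removed
    X_degraded = X.copy()
    X_degraded[rng.random(X.shape) < rate] = np.nan  # random degradation

    # Mean imputation is a simplification; GaussianMixture stands in for EM.
    X_filled = SimpleImputer(strategy="mean").fit_transform(X_degraded)
    labels = GaussianMixture(n_components=3, random_state=0).fit_predict(X_filled)

    # Agreement with the true partition; the paper's own quality criterion differs.
    print(f"missing rate {rate:.0%}: ARI = {adjusted_rand_score(y_true, labels):.3f}")

As the missing rate grows, the adjusted Rand index falls toward zero, which is the qualitative behavior the paper's informational limit is meant to bound from above.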

Details

Language :
English
ISBNs :
9783540735588
Database :
Supplemental Index
Journal :
Selected Contributions in Data Analysis & Classification
Publication Type :
Book
Accession number :
33315448
Full Text :
https://doi.org/10.1007/978-3-540-73560-1_32