Unsupervised extremely randomized trees

Authors :
Malika Smaïl-Tabbone
Miguel Couceiro
Kevin Dalleau
Knowledge representation, reasoning (ORPAILLEUR)
Inria Nancy - Grand Est
Institut National de Recherche en Informatique et en Automatique (Inria)-Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD)
Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA)
Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)-Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA)
Institut National de Recherche en Informatique et en Automatique (Inria)-Université de Lorraine (UL)-Centre National de la Recherche Scientifique (CNRS)
Computational Algorithms for Protein Structures and Interactions (CAPSID)
Institut National de Recherche en Informatique et en Automatique (Inria)-Department of Complex Systems, Artificial Intelligence & Robotics (LORIA - AIS)
Source :
PAKDD 2018-The 22nd Pacific-Asia Conference on Knowledge Discovery and Data Mining, May 2018, Melbourne, Australia, Advances in Knowledge Discovery and Data Mining ISBN: 9783319930398, PAKDD (3)
Publication Year :
2018
Publisher :
HAL CCSD, 2018.

Abstract

In this paper, we present a method for computing dissimilarities on unlabeled data based on extremely randomized trees. The method, Unsupervised Extremely Randomized Trees, is used together with a novel randomized labeling scheme, described here, that we call AddCl3. Unlike existing schemes such as AddCl1 and AddCl2, AddCl3 generates no synthetic instances and therefore does not increase the size of the dataset. An empirical study shows that Unsupervised Extremely Randomized Trees with AddCl3 yields clusterings of competitive quality while clearly outperforming previous similar methods in running time.
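
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that the randomized labeling assigns a random binary label to each real instance (no synthetic instances are added) and that the dissimilarity between two instances is one minus the fraction of trees in which they fall in the same leaf. The names `gini`, `grow`, and `dissimilarity_matrix`, and the parameters `n_trees`, `k`, `min_split`, and `seed`, are hypothetical and chosen for illustration.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def grow(X, y, idx, rng, k, min_split, leaf_of, next_leaf):
    """Recursively split the rows in idx and record each row's leaf id."""
    # Keep only features that still vary among the rows of this node.
    spans = {}
    for f in range(len(X[0])):
        lo = min(X[i][f] for i in idx)
        hi = max(X[i][f] for i in idx)
        if lo < hi:
            spans[f] = (lo, hi)
    best = None
    if len(idx) >= min_split and spans:
        # Extra-trees step: k random features, one random cut point each;
        # keep the candidate with the lowest weighted impurity.
        for f in rng.sample(sorted(spans), min(k, len(spans))):
            cut = rng.uniform(*spans[f])
            left = [i for i in idx if X[i][f] < cut]
            right = [i for i in idx if X[i][f] >= cut]
            if not left or not right:
                continue
            score = (len(left) * gini([y[i] for i in left])
                     + len(right) * gini([y[i] for i in right]))
            if best is None or score < best[0]:
                best = (score, left, right)
    if best is None:
        # No usable split: this node becomes a leaf.
        leaf = next_leaf[0]
        next_leaf[0] += 1
        for i in idx:
            leaf_of[i] = leaf
        return
    grow(X, y, best[1], rng, k, min_split, leaf_of, next_leaf)
    grow(X, y, best[2], rng, k, min_split, leaf_of, next_leaf)

def dissimilarity_matrix(X, n_trees=50, k=1, min_split=5, seed=0):
    """1 - (fraction of trees in which two instances share a leaf)."""
    rng = random.Random(seed)
    n = len(X)
    together = [[0] * n for _ in range(n)]
    for _ in range(n_trees):
        # Assumed labeling step: random labels on the real instances only.
        y = [rng.randint(0, 1) for _ in range(n)]
        leaf_of = [0] * n
        grow(X, y, list(range(n)), rng, k, min_split, leaf_of, [0])
        for i in range(n):
            for j in range(n):
                together[i][j] += leaf_of[i] == leaf_of[j]
    return [[1.0 - together[i][j] / n_trees for j in range(n)]
            for i in range(n)]

# Toy data: two well-separated groups of six 2-D points.
X = [[0.0, 0.0], [0.01, 0.01], [0.2, 0.4], [0.3, 0.1], [0.4, 0.5], [0.5, 0.3],
     [10.0, 10.0], [10.1, 10.2], [10.2, 10.4], [10.3, 10.1], [10.4, 10.5],
     [10.5, 10.3]]
D = dissimilarity_matrix(X, n_trees=50, k=2, min_split=5, seed=0)
```

In this sketch, with `k=1` only one candidate split is drawn per node, so the random labels do not influence the tree. With larger `k` they decide which of the random candidates is kept. The resulting matrix `D` can be passed to any clustering method that accepts precomputed dissimilarities.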

Details

Language :
English
ISBN :
978-3-319-93039-8
Database :
OpenAIRE
Journal :
PAKDD 2018-The 22nd Pacific-Asia Conference on Knowledge Discovery and Data Mining, May 2018, Melbourne, Australia, Advances in Knowledge Discovery and Data Mining ISBN: 9783319930398, PAKDD (3)
Accession number :
edsair.doi.dedup.....a379d68056b2aab68002658b54223163