Author: "Kevin Dalleau" / Topic: [info.info-lg]computer science [cs]/machine learning [cs.lg] - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Kevin Dalleau"' showing total 4 results


1. Unsupervised Extra Trees: a stochastic approach to compute similarities in heterogeneous data

Author: Kevin Dalleau, Miguel Couceiro, Malika Smaïl-Tabbone
Affiliations: Computational Algorithms for Protein Structures and Interactions (CAPSID), Inria Nancy - Grand Est; Knowledge representation, reasoning (ORPAILLEUR); Department of Complex Systems, Artificial Intelligence & Robotics (LORIA - AIS); Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD); Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA); Institut National de Recherche en Informatique et en Automatique (Inria); Centre National de la Recherche Scientifique (CNRS); Université de Lorraine (UL)
Funding: Kevin Dalleau's PhD is funded by the RHU FIGHT-HF (ANR-15-RHUS-0004, "Combattre l'insuffisance cardiaque" [Fighting heart failure], 2015) and the Region Grand Est (France).
Subjects: 0301 basic medicine, Hierarchical agglomerative clustering, Unsupervised classification, Computer science, Decision tree, Monotonic function, Similarity measure, Clustering, 03 medical and health sciences, 0302 clinical medicine, [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG], Robustness (computer science), Cluster (physics), Preprocessor, Cluster analysis, Applied Mathematics, Pattern recognition, Computer Science Applications, 030104 developmental biology, Computational Theory and Mathematics, 030220 oncology & carcinogenesis, Modeling and Simulation, Artificial intelligence, Extremely randomized trees, Information Systems
Abstract: In this paper we present a method to compute similarities on unlabeled data, based on extremely randomized trees. The main idea of our method, Unsupervised Extremely Randomized Trees (UET), is to randomly split the data in an iterative fashion until a stopping criterion is met, and to compute a similarity based on the co-occurrence of samples in the leaves of each generated tree. Using a tree-based approach to compute similarities is interesting, as the inherent [...] We evaluate our method on synthetic and real-world datasets by comparing the mean similarities between samples with the same label and the mean similarities between samples with different labels. These metrics are similar to intracluster and intercluster similarities, and are used to assess the computed similarities directly rather than the results of a clustering algorithm. Our empirical study shows that the method gives distinct similarity values between samples belonging to different clusters, and gives indiscernible values when there is no cluster structure. We also assess some interesting properties such as invariance under monotone transformations of variables and robustness to correlated variables and noise. Finally, we performed hierarchical agglomerative clustering on synthetic and real-world homogeneous and heterogeneous datasets using UET versus standard similarity measures. Our experiments show that the algorithm outperforms existing methods in some cases, and can reduce the amount of preprocessing needed with many real-world datasets.
Published: 2020
Full Text: View/download PDF
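
Note: the leaf co-occurrence idea described in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the stopping rule (`min_size`), the uniform random cut, the restriction to numeric attributes, and the toy data are all assumptions made for illustration. The similarity of two samples is taken as the fraction of random trees in which they fall in the same leaf, and the helper `mean_intra_inter` mirrors the same-label versus different-label comparison mentioned in the abstract.

```python
import random


def random_tree_leaves(indices, data, min_size, rng):
    """Split `indices` recursively with random cuts; return the list of leaves."""
    if len(indices) < min_size:  # stopping criterion (an assumption)
        return [indices]
    # Only attributes that still vary inside this node can be split.
    candidates = [j for j in range(len(data[0]))
                  if len({data[i][j] for i in indices}) > 1]
    if not candidates:
        return [indices]
    j = rng.choice(candidates)
    values = [data[i][j] for i in indices]
    cut = rng.uniform(min(values), max(values))  # random cut point
    left = [i for i in indices if data[i][j] < cut]
    right = [i for i in indices if data[i][j] >= cut]
    if not left or not right:  # degenerate cut: keep the node as a leaf
        return [indices]
    return (random_tree_leaves(left, data, min_size, rng)
            + random_tree_leaves(right, data, min_size, rng))


def leaf_cooccurrence_similarity(data, n_trees=50, min_size=3, seed=0):
    """Fraction of trees in which each pair of samples shares a leaf."""
    rng = random.Random(seed)
    n = len(data)
    counts = [[0] * n for _ in range(n)]
    for _ in range(n_trees):
        for leaf in random_tree_leaves(list(range(n)), data, min_size, rng):
            for a in leaf:
                for b in leaf:
                    counts[a][b] += 1
    return [[c / n_trees for c in row] for row in counts]


def mean_intra_inter(sim, labels):
    """Mean similarity of same-label pairs and of different-label pairs."""
    intra, inter = [], []
    for a in range(len(labels)):
        for b in range(a + 1, len(labels)):
            (intra if labels[a] == labels[b] else inter).append(sim[a][b])
    return sum(intra) / len(intra), sum(inter) / len(inter)


# Hypothetical toy data: two well-separated groups of four points each.
data = [(0, 0), (1, 2), (2, 1), (3, 3),
        (100, 100), (101, 102), (102, 101), (103, 103)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
sim = leaf_cooccurrence_similarity(data)
intra, inter = mean_intra_inter(sim, labels)
print(f"mean intra-label similarity: {intra:.3f}")
print(f"mean inter-label similarity: {inter:.3f}")
```

Because each cut compares values only by order within one attribute, a sketch of this kind does not depend on the scale of the variables, which is in the spirit of the invariance property the abstract mentions.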

2. Unsupervised extremely randomized trees

Author: Malika Smaïl-Tabbone, Miguel Couceiro, Kevin Dalleau
Affiliations: Knowledge representation, reasoning (ORPAILLEUR), Inria Nancy - Grand Est; Computational Algorithms for Protein Structures and Interactions (CAPSID); Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD); Department of Complex Systems, Artificial Intelligence & Robotics (LORIA - AIS); Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA); Institut National de Recherche en Informatique et en Automatique (Inria); Centre National de la Recherche Scientifique (CNRS); Université de Lorraine (UL)
Subjects: Scheme (programming language), Computer science, unsupervised classification, Decision tree, 02 engineering and technology, Similarity measure, Machine learning, 01 natural sciences, Clustering, 010104 statistics & probability, Empirical research, [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG], 0202 electrical engineering, electronic engineering, information engineering, Quality (business), 0101 mathematics, Cluster analysis, distance, ACM: I.: Computing Methodologies/I.5: PATTERN RECOGNITION/I.5.3: Clustering, Running time, ComputingMethodologies_PATTERNRECOGNITION, extremely randomized trees, 020201 artificial intelligence & image processing, Artificial intelligence, ACM: I.: Computing Methodologies/I.2: ARTIFICIAL INTELLIGENCE/I.2.6: Learning
Abstract: In this paper we present a method to compute dissimilarities on unlabeled data, based on extremely randomized trees. This method, Unsupervised Extremely Randomized Trees, is used jointly with a novel randomized labeling scheme that we describe here and call AddCl3. Unlike existing methods such as AddCl1 and AddCl2, no synthetic instances are generated, which avoids increasing the size of the dataset. The empirical study of this method shows that Unsupervised Extremely Randomized Trees with AddCl3 provides competitive results regarding the quality of the resulting clusterings, while clearly outperforming previous similar methods in terms of running time.
Published: 2018

3. Learning from biomedical linked data to suggest valid pharmacogenes

Author: Adrien Coulet, Kevin Dalleau, Patrice Ringot, Sébastien Da Silva, Ndeye Coumba Ndiaye, Yassine Marzougui
Affiliations: Knowledge representation, reasoning (ORPAILLEUR), Inria Nancy - Grand Est; Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD); Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA); Institut National de Recherche en Informatique et en Automatique (Inria); Centre National de la Recherche Scientifique (CNRS); Université de Lorraine (UL); Ecole Nationale Supérieure des Mines de Nancy (ENSMN), Institut Mines-Télécom [Paris] (IMT)-Université de Lorraine (UL); Interactions Gène-Environnement en Physiopathologie Cardio-Vasculaire (IGE-PCV), Institut National de la Santé et de la Recherche Médicale (INSERM)-Université de Lorraine (UL)
Funding: ANR PractiKPharma project, grant ANR-15-CE23-0028 (2015, call AAPG2015), funded by the French National Research Agency (http://practikpharma.loria.fr/), project title translated: "Comparing state-of-the-art knowledge with knowledge extracted from patient records in pharmacogenomics"; Snowflake, an Inria associate team (http://snowflake.loria.fr/); Inria@SiliconValley; Snowball Inria Associate Team
Subjects: 0301 basic medicine, Graph kernel, Computer science, Knowledge discovery from databases, [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI], 0302 clinical medicine, False positive paradox, [INFO.INFO-DB]Computer Science [cs]/Databases [cs.DB], [INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM], Valid pharmacogenes, Linked data, Computer Science Applications, Random forest, Identification (information), Phenotype, [SDV.SP.PHARMA]Life Sciences [q-bio]/Pharmaceutical sciences/Pharmacology, 030220 oncology & carcinogenesis, lcsh:R858-859.7, Information Systems, Computer Networks and Communications, Health Informatics, [SDV.GEN.GH]Life Sciences [q-bio]/Genetics/Human genetics, lcsh:Computer applications to medicine. Medical informatics, Machine learning, Set (abstract data type), 03 medical and health sciences, [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG], Computer Graphics, Selection (linguistics), Data mining, Semantic Web, Linkage (software), Research, Computational Biology, 030104 developmental biology, Pharmacogenetics, Artificial intelligence, Pharmacogenomics
Abstract: Background: A standard task in pharmacogenomics research is identifying genes that may be involved in drug response variability, i.e., pharmacogenes. Because genomic experiments tend to generate many false positives, computational approaches based on the use of background knowledge have been proposed. Until now, only molecular networks or the biomedical literature were used, whereas many other resources are available. Method: We propose here to consume a larger and more diverse set of resources using linked data related to genes, drugs or diseases. One advantage of linked data is that they are built on a standard framework that facilitates the joint use of various sources, and thus makes it easier to consider features of various origins. We propose a selection and linkage of data sources relevant to pharmacogenomics, including for example DisGeNET and Clinvar. We use machine learning to identify and prioritize the pharmacogenes that are most likely to be valid, considering the selected linked data. This identification relies on the classification of gene–drug pairs as either pharmacogenomically associated or not, and was tested with two machine learning methods, random forest and graph kernel, whose results are compared in this article. Results: We assembled a set of linked data related to pharmacogenomics, comprising 2,610,793 triples from six distinct resources. Learning from these data, random forest identifies valid pharmacogenes with an F-measure of 0.73 on a 10-fold cross-validation, whereas graph kernel achieves an F-measure of 0.81. A list of top candidates proposed by both approaches is provided, and how they were obtained is discussed. Electronic supplementary material: The online version of this article (doi:10.1186/s13326-017-0125-1) contains supplementary material, which is available to authorized users.
Published: 2017
Full Text: View/download PDF
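
Note: the evaluation protocol named in the abstract above (an F-measure computed under k-fold cross-validation) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, labels are assumed to be binary (1 for an associated gene–drug pair, 0 otherwise), and the folds are contiguous, so real data would normally be shuffled first.

```python
def f_measure(y_true, y_pred):
    """F-measure (harmonic mean of precision and recall) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:  # no true positives: precision and recall are both zero
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n, k):
    """Yield (train, test) index lists for k contiguous folds over n items."""
    sizes = [n // k + (1 if f < n % k else 0) for f in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def cross_validated_f_measure(xs, ys, k, fit_predict):
    """Average the F-measure over k folds.

    `fit_predict(train_xs, train_ys, test_xs)` is a hypothetical callable that
    trains any classifier on the training fold and returns 0/1 predictions.
    """
    scores = []
    for train, test in k_fold_indices(len(xs), k):
        preds = fit_predict([xs[i] for i in train],
                            [ys[i] for i in train],
                            [xs[i] for i in test])
        scores.append(f_measure([ys[i] for i in test], preds))
    return sum(scores) / len(scores)
```

Any classifier can be plugged in through `fit_predict`; the abstract reports results for random forest and graph kernel, neither of which is reproduced here.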

4. Suggesting valid pharmacogenes by mining linked data

Author: Kevin Dalleau, Ndeye Coumba Ndiaye, Adrien Coulet
Affiliations: Knowledge representation, reasoning (ORPAILLEUR), Inria Nancy - Grand Est; Department of Natural Language Processing & Knowledge Discovery (LORIA - NLPKD); Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA); Institut National de Recherche en Informatique et en Automatique (Inria); Centre National de la Recherche Scientifique (CNRS); Université de Lorraine (UL); Interactions Gène-Environnement en Physiopathologie Cardio-Vasculaire (IGE-PCV), Institut National de la Santé et de la Recherche Médicale (INSERM)-Université de Lorraine (UL)
Funding: ANR-15-CE23-0028, PractiKPharma (2015, call AAPG2015), project title translated: "Comparing state-of-the-art knowledge with knowledge extracted from patient records in pharmacogenomics"
Subjects: [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI], pharmacogenomics, [INFO.INFO-DB]Computer Science [cs]/Databases [cs.DB], Link prediction, [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG], Linked Data, pharmacogenes, [INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM], Data mining, Linked Open Data, random forest
Abstract: A standard task in pharmacogenomics research is identifying genes that may be involved in drug response variability, i.e., pharmacogenes. Because genomic experiments tend to generate many false positives, computational approaches based on the use of background knowledge have been proposed. Until now, those have used only molecular networks or the biomedical literature. Here we propose a novel method that consumes an eclectic set of linked data sources to help validate uncertain drug–gene relationships. One advantage is that linked data are implemented in a standard framework that facilitates the joint use of various sources, making it easy to consider features of various origins. Consequently, we propose an initial selection of linked data sources relevant to pharmacogenomics. We formatted these data to train a random forest algorithm, producing a model that classifies drug–gene pairs as related or not, thus confirming the validity of candidate pharmacogenes. Our model achieves an F-measure of 0.92 on a 100 folds cross-validation. A list of top candidates is provided, and how they were obtained is discussed.
Published: 2015
