1. Ontologies and Information Extraction
- Author
- Nédellec, Claire; Nazarenko, Adeline; Unité Mathématique Informatique et Génome (MIG), Institut National de la Recherche Agronomique (INRA); Université Paris Nord (Paris 13); Centre National de la Recherche Scientifique (CNRS)
- Subjects
FOS: Computer and information sciences, Computer Science - Artificial Intelligence, Artificial Intelligence (cs.AI), Computer Science - Information Retrieval, Information Retrieval (cs.IR), H.3.1, [INFO] Computer Science [cs], [MATH] Mathematics [math], [SDV] Life Sciences [q-bio], information extraction, natural language processing, named entity recognition, relation extraction, extraction rules, machine learning
- Abstract
This document was to have appeared in the book "Handbook and Ontologies" (2004). International audience.

An ontology is a description of conceptual knowledge organized in a computer-based representation, while information extraction (IE) is a method for analyzing texts that express facts in natural language and extracting relevant pieces of information from them. IE and ontologies are involved in two main, related tasks:
• Ontologies are used for information extraction: IE needs ontologies as part of the understanding process for extracting the relevant information.
• Information extraction is used for populating and enhancing the ontology: texts are useful sources of knowledge for designing and enriching ontologies.

These two tasks are combined in a cyclic process: ontologies are used to interpret the text at the right level for IE to be efficient, and IE extracts new knowledge from the text to be integrated into the ontology.

We will argue that even in the simplest cases, IE is an ontology-driven process. It is not a mere text filtering method based on simple pattern matching and keywords, because the extracted pieces of text are interpreted with respect to a predefined partial domain model. We will show that the amount of knowledge involved depends on the nature and depth of the interpretation required to extract the information. Extracting information from texts calls for lexical knowledge, grammars describing the specific syntax of the texts to be analyzed, and semantic and ontological knowledge.

In this chapter, we will not take part in the debate about the boundary between the lexicon and the ontology as a conceptual model. We will instead focus on the role that ontologies, viewed as semantic knowledge bases, could play in IE. The ontologies that can be used for and enriched by IE relate conceptual knowledge to its linguistic realizations (e.g. a concept must be associated with the terms that express it, possibly in various languages). Interpreting the factual information in texts also calls for knowledge of the domain's referential entities, which we consider part of the ontology (Sect. 2.2.1).
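
The cyclic process described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the `Ontology` class, the `extract` and `enrich` functions, the single "Gene Interaction Gene" rule, and the gene names are all hypothetical, chosen only to show an ontology guiding extraction and extraction feeding a new term back into the ontology.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Ontology:
    # Hypothetical toy model: each concept is linked to the terms that express it.
    terms: Dict[str, Set[str]] = field(default_factory=dict)

    def concept_of(self, word: str) -> Optional[str]:
        """Return the concept a word expresses, or None if the word is unknown."""
        for concept, known_terms in self.terms.items():
            if word.lower() in known_terms:
                return concept
        return None

    def add_term(self, concept: str, term: str) -> None:
        self.terms.setdefault(concept, set()).add(term.lower())


def extract(sentence: str, onto: Ontology) -> List[Tuple[str, str, str]]:
    """Ontology used for IE: map tokens to concepts, then apply one rule
    (Gene, Interaction, Gene) at the concept level rather than on raw keywords."""
    tokens = re.findall(r"\w+", sentence)
    concepts = [onto.concept_of(t) for t in tokens]
    facts = []
    for i in range(len(tokens) - 2):
        if concepts[i:i + 3] == ["Gene", "Interaction", "Gene"]:
            facts.append((tokens[i], tokens[i + 1], tokens[i + 2]))
    return facts


def enrich(sentence: str, onto: Ontology) -> List[str]:
    """IE used for the ontology: an unknown word between two known genes is
    proposed as a new Interaction term (an assumed, deliberately naive heuristic)."""
    tokens = re.findall(r"\w+", sentence)
    added = []
    for i in range(1, len(tokens) - 1):
        is_unknown = onto.concept_of(tokens[i]) is None
        between_genes = (onto.concept_of(tokens[i - 1]) == "Gene"
                         and onto.concept_of(tokens[i + 1]) == "Gene")
        if is_unknown and between_genes:
            onto.add_term("Interaction", tokens[i])
            added.append(tokens[i].lower())
    return added


onto = Ontology({"Gene": {"gena", "genb"}, "Interaction": {"activates"}})
known = extract("GenA activates GenB", onto)    # rule fires on known terms
missed = extract("GenA represses GenB", onto)   # "represses" is not yet in the ontology
learned = enrich("GenA represses GenB", onto)   # the ontology gains a new term
found = extract("GenA represses GenB", onto)    # the same sentence is now extracted
```

In this sketch the second sentence yields nothing until the enrichment step has run, which mirrors the cycle: the ontology determines what can be extracted, and the text supplies new terms for the ontology.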
- Published
- 2006