1. aMV-LSTM
- Author
- Mohand Boughanem, Taoufiq Dkaki, Jose G. Moreno, and Thiziri Belkacem; Institut de Recherche en Informatique de Toulouse (IRIT), Université de Toulouse, CNRS, Toulouse INP, UT1, UT2J, UT3 (France)
- Subjects
Computer science, Artificial intelligence, Natural language processing, Text matching, Text representation, Attention models, Weighting, Feature learning, Identification (information), Document and text processing, [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing
- Abstract
Deep models have attracted wide interest in recent NLP and IR research. Among the proposed models, position-based models account for the position of each word in the text, while attention-based models account for the importance of a word relative to the other words. Positional information is an important feature for learning text representations. However, positional features do not capture the importance of a given word among the others in a text, which is a key aspect of text matching. In this paper, we propose a model that combines a position-based representation learning approach with an attention-based weighting process, where the latter learns an importance coefficient for each word of the input text. Specifically, we extend the position-based model MV-LSTM with an attention layer, yielding a parameterizable architecture. We believe that when the model is aware of both word position and word importance, the learned representations carry more relevant features for the matching process. Our model, named aMV-LSTM, learns attention-based coefficients to weight the words of the input sentences before computing their position-based representations. Experimental results on question/answer matching and question-pair identification tasks show that the proposed model outperforms the MV-LSTM baseline and several state-of-the-art models.
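To make the described architecture concrete, the following is a minimal PyTorch sketch of the idea as the abstract presents it: an attention layer assigns an importance coefficient to each word, the word embeddings are rescaled by these coefficients, and a Bi-LSTM then produces the position-based representations that are matched MV-LSTM-style (pairwise similarity plus k-max pooling). All names and design details here (`AttentiveMVLSTM`, `embed_dim`, `top_k`, the softmax attention scorer, the MLP scoring head) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class AttentiveMVLSTM(nn.Module):
    """Illustrative sketch of an attention-weighted MV-LSTM matcher.

    Hypothetical reconstruction from the abstract: a learned attention
    layer assigns an importance coefficient to each word, the embeddings
    are rescaled by these coefficients, and a Bi-LSTM then builds the
    position-based sentence representations that are matched.
    """

    def __init__(self, vocab_size, embed_dim=100, hidden_dim=64, top_k=10):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Attention scorer: one importance logit per word (assumed form).
        self.attn = nn.Linear(embed_dim, 1)
        # Bi-LSTM yields a positional representation at every word.
        self.bilstm = nn.LSTM(embed_dim, hidden_dim,
                              batch_first=True, bidirectional=True)
        self.top_k = top_k
        # Small MLP scoring head over the k strongest interaction signals.
        self.score = nn.Sequential(nn.Linear(top_k, 32), nn.ReLU(),
                                   nn.Linear(32, 1))

    def weighted_positions(self, tokens):
        e = self.embed(tokens)                       # (B, L, D)
        coeffs = torch.softmax(self.attn(e), dim=1)  # (B, L, 1) importance
        h, _ = self.bilstm(e * coeffs)               # weight words first
        return h                                     # (B, L, 2H)

    def forward(self, q_tokens, a_tokens):
        hq = self.weighted_positions(q_tokens)
        ha = self.weighted_positions(a_tokens)
        # Interaction matrix: cosine similarity between every position pair.
        sim = F.cosine_similarity(hq.unsqueeze(2), ha.unsqueeze(1), dim=-1)
        # k-max pooling over the flattened interactions (MV-LSTM style).
        top = sim.flatten(1).topk(self.top_k, dim=1).values
        return self.score(top).squeeze(-1)           # matching score per pair


# Quick smoke test with random token ids (hypothetical 10k vocabulary).
model = AttentiveMVLSTM(vocab_size=10_000)
q = torch.randint(0, 10_000, (4, 12))  # 4 questions, 12 tokens each
a = torch.randint(0, 10_000, (4, 20))  # 4 candidate answers, 20 tokens each
scores = model(q, a)                   # shape: (4,)
```

In MV-LSTM, k-max pooling over the pairwise similarities keeps the strongest local matching signals wherever they occur in the two texts; the added softmax attention simply rescales each word's contribution before those positional representations are built, which is how word importance and word position are combined in this sketch.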
- Published
- 2019