Author: "Versley, Yannick" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Versley, Yannick"' showing total 117 results

Start Over Author "Versley, Yannick"

117 results on '"Versley, Yannick"'

1. LINGUIST: Language Model Instruction Tuning to Generate Annotated Utterances for Intent Classification and Slot Tagging

Author: Rosenbaum, Andy, Soltan, Saleh, Hamza, Wael, Versley, Yannick, and Boese, Markus
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: We present LINGUIST, a method for generating annotated data for Intent Classification and Slot Tagging (IC+ST), via fine-tuning AlexaTM 5B, a 5-billion-parameter multilingual sequence-to-sequence (seq2seq) model, on a flexible instruction prompt. In a 10-shot novel intent setting for the SNIPS dataset, LINGUIST surpasses state-of-the-art approaches (Back-Translation and Example Extrapolation) by a wide margin, showing absolute improvement for the target intents of +1.9 points on IC Recall and +2.5 points on ST F1 Score. In the zero-shot cross-lingual setting of the mATIS++ dataset, LINGUIST out-performs a strong baseline of Machine Translation with Slot Alignment by +4.14 points absolute on ST F1 Score across 6 languages, while matching performance on IC. Finally, we verify our results on an internal large-scale multilingual dataset for conversational agent IC+ST and show significant improvements over a baseline which uses Back-Translation, Paraphrasing and Slot Catalog Resampling. To our knowledge, we are the first to demonstrate instruction fine-tuning of a large-scale seq2seq model to control the outputs of multilingual intent- and slot-labeled data generation., Comment: Accepted to The 29th International Conference on Computational Linguistics (COLING 2022) October 12-17, 2022, Gyeongju, Republic of Korea https://coling2022.org/
Published: 2022

2. Findings of the 2016 WMT Shared Task on Cross-lingual Pronoun Prediction

Author: Guillou, Liane, Hardmeier, Christian, Nakov, Preslav, Stymne, Sara, Tiedemann, Jörg, Versley, Yannick, Cettolo, Mauro, Webber, Bonnie, and Popescu-Belis, Andrei
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Information Retrieval, 68T50, I.2.7
Abstract: We describe the design, the evaluation setup, and the results of the 2016 WMT shared task on cross-lingual pronoun prediction. This is a classification task in which participants are asked to provide predictions on what pronoun class label should replace a placeholder value in the target-language text, provided in lemmatised and PoS-tagged form. We provided four subtasks, for the English-French and English-German language pairs, in both directions. Eleven teams participated in the shared task; nine for the English-French subtask, five for French-English, nine for English-German, and six for German-English. Most of the submissions outperformed two strong language-model based baseline systems, with systems using deep recurrent neural networks outperforming those using other architectures for most language pairs., Comment: cross-lingual pronoun prediction, WMT, shared task, English, German, French
Published: 2019

3. Incorporating Semi-supervised Features into Discontinuous Easy-First Constituent Parsing

Author: Versley, Yannick
Subjects: Computer Science - Computation and Language
Abstract: This paper describes adaptations for EaFi, a parser for easy-first parsing of discontinuous constituents, to adapt it to multiple languages as well as make use of the unlabeled data that was provided as part of the SPMRL shared task 2014.
Published: 2014

4. Challenges and Directions of Further Research

Author: Poesio, Massimo, Stuckardt, Roland, Versley, Yannick, Hirschberg, Julia, Editor-in-chief, Hovy, Eduard, Editor-in-chief, Johnson, Mark, Editor-in-chief, Poesio, Massimo, editor, Stuckardt, Roland, editor, and Versley, Yannick, editor
Published: 2016
Full Text: View/download PDF

5. Using Lexical and Encyclopedic Knowledge

Author: Versley, Yannick, Poesio, Massimo, Ponzetto, Simone, Hirschberg, Julia, Editor-in-chief, Hovy, Eduard, Editor-in-chief, Johnson, Mark, Editor-in-chief, Poesio, Massimo, editor, Stuckardt, Roland, editor, and Versley, Yannick, editor
Published: 2016
Full Text: View/download PDF

6. Off-the-Shelf Tools

Author: Versley, Yannick, Björkelund, Anders, Hirschberg, Julia, Editor-in-chief, Hovy, Eduard, Editor-in-chief, Johnson, Mark, Editor-in-chief, Poesio, Massimo, editor, Stuckardt, Roland, editor, and Versley, Yannick, editor
Published: 2016
Full Text: View/download PDF

7. Early Approaches to Anaphora Resolution: Theoretically Inspired and Heuristic-Based

Author: Poesio, Massimo, Stuckardt, Roland, Versley, Yannick, Vieira, Renata, Hirschberg, Julia, Editor-in-chief, Hovy, Eduard, Editor-in-chief, Johnson, Mark, Editor-in-chief, Poesio, Massimo, editor, Stuckardt, Roland, editor, and Versley, Yannick, editor
Published: 2016
Full Text: View/download PDF

8. Annotated Corpora and Annotation Tools

Author: Poesio, Massimo, Pradhan, Sameer, Recasens, Marta, Rodriguez, Kepa, Versley, Yannick, Hirschberg, Julia, Editor-in-chief, Hovy, Eduard, Editor-in-chief, Johnson, Mark, Editor-in-chief, Poesio, Massimo, editor, Stuckardt, Roland, editor, and Versley, Yannick, editor
Published: 2016
Full Text: View/download PDF

9. Using Lexical and Encyclopedic Knowledge

Author: Versley, Yannick, primary, Poesio, Massimo, additional, and Ponzetto, Simone, additional
Published: 2016
Full Text: View/download PDF

10. Annotated Corpora and Annotation Tools

Author: Poesio, Massimo, primary, Pradhan, Sameer, additional, Recasens, Marta, additional, Rodriguez, Kepa, additional, and Versley, Yannick, additional
Published: 2016
Full Text: View/download PDF

11. Early Approaches to Anaphora Resolution: Theoretically Inspired and Heuristic-Based

Author: Poesio, Massimo, primary, Stuckardt, Roland, additional, Versley, Yannick, additional, and Vieira, Renata, additional
Published: 2016
Full Text: View/download PDF

12. Challenges and Directions of Further Research

Author: Poesio, Massimo, primary, Stuckardt, Roland, additional, and Versley, Yannick, additional
Published: 2016
Full Text: View/download PDF

13. Off-the-Shelf Tools

Author: Versley, Yannick, primary and Björkelund, Anders, additional
Published: 2016
Full Text: View/download PDF

14. Vagueness and Referential Ambiguity in a Large-Scale Annotated Corpus

Author: Versley, Yannick
Published: 2008
Full Text: View/download PDF

15. Continuous Model Improvement for Language Understanding with Machine Translation

Author: Abujabal, Abdalghani, primary, Delli Bovi, Claudio, additional, Ryu, Sungho, additional, Gojayev, Turan, additional, Triefenbach, Fabian, additional, and Versley, Yannick, additional
Published: 2021
Full Text: View/download PDF

16. Findings of the 2017 DiscoMT Shared Task on Cross-lingual Pronoun Prediction

Author: Loáiciga, Sharid, Stymne, Sara, Nakov, Preslav, Hardmeier, Christian, Tiedemann, Jörg, Cettolo, Mauro, Versley, Yannick, Loáiciga, Sharid, Stymne, Sara, Nakov, Preslav, Hardmeier, Christian, Tiedemann, Jörg, Cettolo, Mauro, and Versley, Yannick
Abstract: We describe the design, the setup, and the evaluation results of the DiscoMT 2017 shared task on cross-lingual pronoun prediction. The task asked participants to predict a target-language pronoun given a source-language pronoun in the context of a sentence. We further provided a lemmatized target-language human-authored translation of the source sentence, and automatic word alignments between the source sentence words and the target-language lemmata. The aim of the task was to predict, for each target-language pronoun placeholder, the word that should replace it from a small, closed set of classes, using any type of information that can be extracted from the entire document. We offered four subtasks, each for a different language pair and translation direction: English-to-French, English-to-German, German-to-English, and Spanish-to-English. Five teams participated in the shared task, making submissions for all language pairs. The evaluation results show that all participating teams outperformed two strong n-gram-based language model-based baseline systems by a sizable margin.
Published: 2017

17. Constituent order in German multiple questions: Normal order and (apparent) anti-superiority effects

Author: Fanselow, Gisbert, Häussler, Jana, Weskott, Thomas, Featherston, Sam, and Versley, Yannick
Published: 2016

18. On the Limits of Non-Parallelism in ATB Movement: Experimental Evidence for Strict Syntactic Identity

Author: Hartmann, Jutta, Konietzko, Andreas, Salzmann, Martin, Featherston, Sam, and Versley, Yannick
Published: 2016

19. Findings of the 2017 DiscoMT Shared Task on Cross-lingual Pronoun Prediction

Author: Loáiciga, Sharid, primary, Stymne, Sara, additional, Nakov, Preslav, additional, Hardmeier, Christian, additional, Tiedemann, Jörg, additional, Cettolo, Mauro, additional, and Versley, Yannick, additional
Published: 2017
Full Text: View/download PDF

20. Findings of the 2016 WMT Shared Taskon Cross-lingual Pronoun Prediction

Author: Guillou, Liane, Hardmeier, Christian, Nakov, Preslav, Stymne, Sara, Tiedemann, Jörg, Versley, Yannick, Cettolo, Mauro, Webber, Bonnie, Popescu-Belis, Andrei, Guillou, Liane, Hardmeier, Christian, Nakov, Preslav, Stymne, Sara, Tiedemann, Jörg, Versley, Yannick, Cettolo, Mauro, Webber, Bonnie, and Popescu-Belis, Andrei
Abstract: We describe the design, the evaluation setup, and the results of the 2016 WMT shared task on cross-lingual pronoun prediction. This is a classification task in which participants are asked to provide predictions on what pronoun class label should replace a placeholder value in the target-language text, provided in lemmatised and PoS-tagged form. We provided four subtasks, for the English–French and English–German language pairs, in both directions. Eleven teams participated in the shared task; nine for the English–French subtask, five for French–English, nine for English–German, and six for German–English. Most of the submissions outperformed two strong language-model-based baseline systems, with systems using deep recurrent neural networks outperforming those using other architectures for most language pairs.
Published: 2016

21. Implicit Semantic Roles in a Multilingual Setting

Author: Sikos, Jennifer, primary, Versley, Yannick, additional, and Frank, Anette, additional
Published: 2016
Full Text: View/download PDF

22. Findings of the 2016 WMT Shared Task on Cross-lingual Pronoun Prediction

Author: Guillou, Liane, primary, Hardmeier, Christian, additional, Nakov, Preslav, additional, Stymne, Sara, additional, Tiedemann, Jörg, additional, Versley, Yannick, additional, Cettolo, Mauro, additional, Webber, Bonnie, additional, and Popescu-Belis, Andrei, additional
Published: 2016
Full Text: View/download PDF

23. ICL-HD at SemEval-2016 Task 8: Meaning Representation Parsing - Augmenting AMR Parsing with a Preposition Semantic Role Labeling Neural Network

Author: Brandt, Lauritz, primary, Grimm, David, additional, Zhou, Mengfei, additional, and Versley, Yannick, additional
Published: 2016
Full Text: View/download PDF

24. ICL-HD at SemEval-2016 Task 10: Improving the Detection of Minimal Semantic Units and their Meanings with an Ontology and Word Embeddings

Author: Kirilin, Angelika, primary, Krauss, Felix, additional, and Versley, Yannick, additional
Published: 2016
Full Text: View/download PDF

25. Discontinuity Re-visited: A Minimalist Approach to Pseudoprojective Constituent Parsing

Author: Versley, Yannick, primary
Published: 2016
Full Text: View/download PDF

26. Pronoun-Focused MT and Cross-Lingual Pronoun Prediction: Findings of the 2015 DiscoMT Shared Task on Pronoun Translation

Author: Hardmeier, Christian, Nakov, Preslav, Stymne, Sara, Tiedemann, Jörg, Versley, Yannick, Cettolo, Mauro, Hardmeier, Christian, Nakov, Preslav, Stymne, Sara, Tiedemann, Jörg, Versley, Yannick, and Cettolo, Mauro
Abstract: We describe the design, the evaluation setup, and the results of the DiscoMT 2015 shared task, which included two subtasks, relevant to both the machine translation (MT) and the discourse communities: (i) pronoun-focused translation, a practical MT task, and (ii) cross-lingual pronoun prediction, a classification task that requires no specific MT expertise and is interesting as a machine learning task in its own right. We focused on the English–French language pair, for which MT output is generally of high quality, but has visible issues with pronoun translation due to differences in the pronoun systems of the two languages. Six groups participated in the pronoun-focused translation task and eight groups in the cross-lingual pronoun prediction task.
Published: 2015

27. Overview of the SPMRL 2013 Shared Task: A Cross-Framework Evaluation of Parsing Morphologically Rich Languages

Author: Seddah, Djamé, Tsarfaty, Reut, Kübler, Sandra, Candito, Marie, Choi, Jinho D., Farkas, Richárd, Foster, Jennifer, Goenaga, Iakes, Gojenola Galletebeitia, Koldo, Goldberg, Yoav, Green, Spence, Habash, Nizar, Kuhlmann, Marco, Maier, Wolfgang, Nivre, Joakim, Przepiórkowski, Adam, Roth, Ryan, Seeker, Wolfgang, Versley, Yannick, Vincze, Veronika, Wolińsk, Marcin, Wróblewska, Alina, Villemonte de La Clergerie, Éric, Analyse Linguistique Profonde à Grande Echelle, Large-scale deep linguistic processing (ALPAGE), Inria Paris-Rocquencourt, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Université Paris Diderot - Paris 7 (UPD7), Faculty of Mathematics and Computer Science, Weizmann Institute of Science [Rehovot, Israël], Computational Linguistics, Department of Linguistics (DL), Indiana University [Bloomington], Indiana University System-Indiana University System-Indiana University [Bloomington], Indiana University System-Indiana University System, IPSOFT (Ipsoft), University of Massachusetts [Amherst] (UMass Amherst), University of Massachusetts System (UMASS)-University of Massachusetts System (UMASS), Department of Computer Algorithms and Artificial Intelligence., University of Szeged [Szeged], National Centre for Language Technology (NCLT), Dublin City University [Dublin] (DCU), Computer Languages and Systems, University of the Basque Country [Bizkaia] (UPV/EHU), Department of Computer Science, Bar-Ilan University [Israël], Stanford NLP Group, Stanford University, Center for Computational Learning Systems (CCLS), Columbia University [New York], Computational Linguistics and Language Technology, Uppsala University, Abteilung für Computerlinguistik [Düsseldorf ], Philosophische Fakultät [Düsseldorf], Heinrich Heine Universität Düsseldorf = Heinrich Heine University [Düsseldorf]-Heinrich Heine Universität Düsseldorf = Heinrich Heine University [Düsseldorf]-Instituts für Sprache und Information [Düsseldorf], Heinrich Heine Universität Düsseldorf = Heinrich Heine University [Düsseldorf], Institute of Computer Science [Warszawa], Polska Akademia Nauk = Polish Academy of Sciences (PAN), Institut für Maschinelle Sprachverarbeitung, Universität Stuttgart [Stuttgart], Institute for Computational Linguistics, Universität Heidelberg [Heidelberg], Université Paris Diderot - Paris 7 (UPD7)-Inria Paris-Rocquencourt, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), University of the Basque Country/Euskal Herriko Unibertsitatea (UPV/EHU), and Universität Heidelberg [Heidelberg] = Heidelberg University
Subjects: langues à morphologie riche, evaluation, parsing en dépendance, Statistical Parsing, Morphologically-rich languages, parsing en constituents, non gold scenarios, Parsing statistiques, [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL], dependency parsing, TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, analyse syntaxique à large couverture, phrase based parsing, treebanking, [SHS.LANGUE]Humanities and Social Sciences/Linguistics, non gold tokenization
Abstract: International audience; This paper reports on the first shared task on statistical parsing of morphologically rich lan- guages (MRLs). The task features data sets from nine languages, each available both in constituency and dependency annotation. We report on the preparation of the data sets, on the proposed parsing scenarios, and on the eval- uation metrics for parsing MRLs given dif- ferent representation types. We present and analyze parsing results obtained by the task participants, and then provide an analysis and comparison of the parsers across languages and frameworks, reported for gold input as well as more realistic parsing scenarios.
Published: 2013

28. A Syntax-first Approach to High-quality Morphological Analysis and Lemma Disambiguation for the TüBa-D/Z Treebank

Author: Versley, Yannick, Beck, Kathrin, Hinrichs, Erhard, and Telljohann, Heike
Abstract: Proceedings of the Ninth International Workshop on Treebanks and Linguistic Theories. Editors: Markus Dickinson, Kaili Müürisep and Marco Passarotti. NEALT Proceedings Series, Vol. 9 (2010), 233-244. © 2010 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/15891 .
Published: 2010

29. Subsentential Sentiment on a Shoestring: A Crosslingual Analysis of Compositional Classification

Author: Haas, Michael, primary and Versley, Yannick, additional
Published: 2015
Full Text: View/download PDF

30. Pronoun-Focused MT and Cross-Lingual Pronoun Prediction: Findings of the 2015 DiscoMT Shared Task on Pronoun Translation

Author: Hardmeier, Christian, primary, Nakov, Preslav, additional, Stymne, Sara, additional, Tiedemann, Jörg, additional, Versley, Yannick, additional, and Cettolo, Mauro, additional
Published: 2015
Full Text: View/download PDF

31. Statistical Parsing of Morphologically Rich Languages (SPMRL) What, How and Whither

Author: Tsarfaty, Reut, Seddah, Djamé, Goldberg, Yoav, Kübler, Sandra, Candito, Marie, Foster, Jennifer, Versley, Yannick, Rehbein, Ines, Tounsi, Lamia, Uppsala University, Analyse Linguistique Profonde à Grande Echelle, Large-scale deep linguistic processing (ALPAGE), Inria Paris-Rocquencourt, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Université Paris Diderot - Paris 7 (UPD7), Ben-Gurion University of the Negev (BGU), Indian Institute of Science [Bangalore] (IISc Bangalore), Emmy Noether Project (SFB 833), Eberhard Karls Universität Tübingen = Eberhard Karls University of Tuebingen, National Centre for Language Technology (NCLT), Dublin City University [Dublin] (DCU), Allgemeine Linguistik Computational Linguistics and phonetics (Allgemeine Linguistik), and Saarland University [Saarbrücken]
Subjects: hindi, german, [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing, arabic, hebrew, french, Computational linguistics, Linguistics, statistical parsing, basque, morphologically rich languages, ComputingMilieux_MISCELLANEOUS, [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]
Abstract: The term Morphologically Rich Languages (MRLs) refers to languages in which significant information concerning syntactic units and relations is expressed at word-level. There is ample evidence that the application of readily available statistical parsing models to such languages is susceptible to serious performance degradation. The first workshop on statistical parsing of MRLs hosts a variety of contributions which show that despite language-specific idiosyncrasies, the problems associated with parsing MRLs cut across languages and parsing frameworks. In this paper we review the current state-of-affairs with respect to parsing MRLs and point out central challenges. We synthesize the contributions of researchers working on parsing Arabic, Basque, French, German, Hebrew, Hindi and Korean to point out shared solutions across languages. The overarching analysis suggests itself as a source of directions for future investigations.
Published: 2010

32. Discovery of ambiguous and unambiguous discourse connectives via annotation projection

Author: Versley, Yannick
Abstract: Proceedings of the Workshop on Annotation and Exploitation of Parallel Corpora AEPC 2010. Editors: Lars Ahrenberg, Jörg Tiedemann and Martin Volk. NEALT Proceedings Series, Vol. 10 (2010), 83-92. © 2010 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/15893 .
Published: 2010
Full Text: View/download PDF

33. Auflösung koreferenten Bridgings in deutschen Zeitungstexten

Author: Versley, Yannick and Hinrichs, Erhard (Prof. Dr.)
Subjects: Computerlinguistik , Maschinelles Lernen , Texttechnologie , Referenzidentität , Syntaktische Analyse, Computational linguistics , Discourse , Definite descriptions , Coreference , German , Parsing
Abstract: This thesis is concerned with techniques to improve coreference resolution for German newspaper text - grouping noun phrases of a text with respect to their intended reference, specifically the non-pronoun part of the problem including definite descriptions (common noun phrases with a definite article) and names. It is concerned especially with techniques to establish plausible coreference relations between mentions with different heads using indicators of semantic similarity, the preprocessing steps necessary for these indicators, and useful techniques for establishing the anaphoricity or non-anaphoricity of definite descriptions. Diese Dissertation befasst sich mit Techniken, die geeignet sind, Koreferenzauflösung in deutschen Zeitungstexten - das Gruppieren von Nominalphrasen innerhalb eines Textes entsprechend deren intendierter Referenz - zu verbessen, genauer das bislang weniger untersuchte Teilgebiet der Nicht-Pronomen, nämlich der definiten Beschreibungen (Nominalphrasen, die ein Appelativum als lexikalischen Kopf und einen definiten Artikel aufweisen) und Eigennamen. Im Mittelpunkt stehen korpusbasierte Techniken, die plausible Koreferenzbeziehungen zwischen Nennungen mit verschiedenen lexikalischen Köpfen anhand von Indikatoren semantischer Ähnlichkeit etablieren, die für diese Indikatoren notwendigen Vorverarbeitungsschritte, und Methoden zur unterscheidung von anaphorischen und nicht-anaphorischen definiten Beschreibungen.
Published: 2010

34. Decorrelation and shallow semantic patterns for distributional clustering of nouns and verbs

Author: Versley, Yannick
Subjects: ddc:400
Abstract: Distributional approximations to lexical semantics are very useful not only in helping the creation of lexical semantic resources (Kilgariff et al., 2004; Snow et al., 2006), but also when directly applied in tasks that can benefit from large-coverage semantic knowledge such as coreference resolution (Poesio et al., 1998; Gasperin and Vieira, 2004; Versley, 2007), word sense disambiguation (Mc- Carthy et al., 2004) or semantical role labeling (Gordon and Swanson, 2007). We present a model that is built from Webbased corpora using both shallow patterns for grammatical and semantic relations and a window-based approach, using singular value decomposition to decorrelate the feature space which is otherwise too heavily influenced by the skewed topic distribution of Web corpora.
Published: 2009

35. How to compare treebanks

Author: Kübler, Sandra, Maier, Wolfgang, Rehbein, Ines, and Versley, Yannick
Subjects: ddc:400
Abstract: Recent years have seen an increasing interest in developing standards for linguistic annotation, with a focus on the interoperability of the resources. This effort, however, requires a profound knowledge of the advantages and disadvantages of linguistic annotation schemes in order to avoid importing the flaws and weaknesses of existing encoding schemes into the new standards. This paper addresses the question how to compare syntactically annotated corpora and gain insights into the usefulness of specific design decisions. We present an exhaustive evaluation of two German treebanks with crucially different encoding schemes. We evaluate three different parsers trained on the two treebanks and compare results using EVALB, the Leaf-Ancestor metric, and a dependency-based evaluation. Furthermore, we present TePaCoC, a new testsuite for the evaluation of parsers on complex German grammatical constructions. The testsuite provides a well thought-out error classification, which enables us to compare parser output for parsers trained on treebanks with different encoding schemes and provides interesting insights into the impact of treebank annotation schemes on specific constructions like PP attachment or non-constituent coordination.
Published: 2008

36. Using the web to resolve coreferent bridging in German newspaper text

Author: Versley, Yannick
Subjects: ddc:400
Abstract: We adopt Markert and Nissim (2005)’s approach of using the World Wide Web to resolve cases of coreferent bridging for German and discuss the strength and weaknesses of this approach. As the general approach of using surface patterns to get information on ontological relations between lexical items has only been tried on English, it is also interesting to see whether the approach works for German as well as it does for English and what differences between these languages need to be accounted for. We also present a novel approach for combining several patterns that yields an ensemble that outperforms the best-performing single patterns in terms of both precision and recall.
Published: 2007

37. Antecedent selection techniques for high-recall roreference resolution

Author: Versley, Yannick
Subjects: ddc:400
Abstract: We investigate methods to improve the recall in coreference resolution by also trying to resolve those definite descriptions where no earlier mention of the referent shares the same lexical head (coreferent bridging). The problem, which is notably harder than identifying coreference relations among mentions which have the same lexical head, has been tackled with several rather different approaches, and we attempt to provide a meaningful classification along with a quantitative comparison. Based on the different merits of the methods, we discuss possibilities to improve them and show how they can be effectively combined.
Published: 2007

38. From surface dependencies towards deeper semantic representations [Semantic representations]

Author: Versley, Yannick and Zinsmeister, Heike
Subjects: ddc:400
Abstract: In the past, a divide could be seen between ’deep’ parsers on the one hand, which construct a semantic representation out of their input, but usually have significant coverage problems, and more robust parsers on the other hand, which are usually based on a (statistical) model derived from a treebank and have larger coverage, but leave the problem of semantic interpretation to the user. More recently, approaches have emerged that combine the robustness of datadriven (statistical) models with more detailed linguistic interpretation such that the output could be used for deeper semantic analysis. Cahill et al. (2002) use a PCFG-based parsing model in combination with a set of principles and heuristics to derive functional (f-)structures of Lexical-Functional Grammar (LFG). They show that the derived functional structures have a better quality than those generated by a parser based on a state-of-the-art hand-crafted LFG grammar. Advocates of Dependency Grammar usually point out that dependencies already are a semantically meaningful representation (cf. Menzel, 2003). However, parsers based on dependency grammar normally create underspecified representations with respect to certain phenomena such as coordination, apposition and control structures. In these areas they are too "shallow" to be directly used for semantic interpretation. In this paper, we adopt a similar approach to Cahill et al. (2002) using a dependency-based analysis to derive functional structure, and demonstrate the feasibility of this approach using German data. A major focus of our discussion is on the treatment of coordination and other potentially underspecified structures of the dependency data input. F-structure is one of the two core levels of syntactic representation in LFG (Bresnan, 2001). Independently of surface order, it encodes abstract syntactic functions that constitute predicate argument structure and other dependency relations such as subject, predicate, adjunct, but also further semantic information such as the semantic type of an adjunct (e.g. directional). Normally f-structure is captured as a recursive attribute value matrix, which is isomorphic to a directed graph representation. Figure 5 depicts an example target f-structure. As mentioned earlier, these deeper-level dependency relations can be used to construct logical forms as in the approaches of van Genabith and Crouch (1996), who construct underspecified discourse representations (UDRSs), and Spreyer and Frank (2005), who have robust minimal recursion semantics (RMRS) as their target representation. We therefore think that f-structures are a suitable target representation for automatic syntactic analysis in a larger pipeline of mapping text to interpretation. In this paper, we report on the conversion from dependency structures to fstructure. Firstly, we evaluate the f-structure conversion in isolation, starting from hand-corrected dependencies based on the TüBa-D/Z treebank and Versley (2005)´s conversion. Secondly, we start from tokenized text to evaluate the combined process of automatic parsing (using Foth and Menzel (2006)´s parser) and f-structure conversion. As a test set, we randomly selected 100 sentences from TüBa-D/Z which we annotated using a scheme very close to that of the TiGer Dependency Bank (Forst et al., 2004). In the next section, we sketch dependency analysis, the underlying theory of our input representations, and introduce four different representations of coordination. We also describe Weighted Constraint Dependency Grammar (WCDG), the dependency parsing formalism that we use in our experiments. Section 3 characterises the conversion of dependencies to f-structures. Our evaluation is presented in section 4, and finally, section 5 summarises our results and gives an overview of problems remaining to be solved.
Published: 2006

39. A constraint-based approach to noun phrase coreference resolution in German newspaper text

Author: Versley, Yannick
Subjects: Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing), ddc:400
Abstract: In this paper, we investigate the usefulness of a wide range of features for their usefulness in the resolution of nominal coreference, both as hard constraints (i.e. completely removing elements from the list of possible candidates) as well as soft constraints (where a cumulation of violations of soft constraints will make it less likely that a candidate is chosen as the antecedent). We present a state of the art system based on such constraints and weights estimated with a maximum entropy model, using lexical information to resolve cases of coreferent bridging.
Published: 2006

40. Disagreement dissected : vagueness as a source of ambiguity in nominal (co-)reference

Author: Versley, Yannick
Subjects: ComputingMethodologies_DOCUMENTANDTEXTPROCESSING, ddc:400, ComputingMethodologies_ARTIFICIALINTELLIGENCE
Abstract: Using a qualitative analysis of disagreements from a referentially annotated newspaper corpus, we show that, in coreference annotation, vague referents are prone to greater disagreement. We show how potentially problematic cases can be dealt with in a way that is practical even for larger-scale annotation, considering a real-world example from newspaper text.
Published: 2006

41. Parser evaluation across text types

Author: Versley, Yannick
Subjects: TheoryofComputation_MATHEMATICALLOGICANDFORMALLANGUAGES, ddc:400
Abstract: When a statistical parser is trained on one treebank, one usually tests it on another portion of the same treebank, partly due to the fact that a comparable annotation format is needed for testing. But the user of a parser may not be interested in parsing sentences from the same newspaper all over, or even wants syntactic annotations for a slightly different text type. Gildea (2001) for instance found that a parser trained on the WSJ portion of the Penn Treebank performs less well on the Brown corpus (the subset that is available in the PTB bracketing format) than a parser that has been trained only on the Brown corpus, although the latter one has only half as many sentences as the former. Additionally, a parser trained on both the WSJ and Brown corpora performs less well on the Brown corpus than on the WSJ one. This leads us to the following questions that we would like to address in this paper: - Is there a difference in usefulness of techniques that are used to improve parser performance between the same-corpus and the different-corpus case? - Are different types of parsers (rule-based and statistical) equally sensitive to corpus variation? To achieve this, we compared the quality of the parses of a hand-crafted constraint-based parser and a statistical PCFG-based parser that was trained on a treebank of German newspaper text.
Published: 2005

42. Extracting spatial information : grounding, classifying and linking spatial expressions

Author: Schilder, Frank, Versley, Yannick, and Habel, Christopher
Subjects: ddc:400
Abstract: This paper is concerned with the tagging of spatial expressions in German newspaper articles, assigning a meaning to the expression and classifying the usages of the spatial expression and linking the derived referent to an event description. In our system, we implemented the activation of concepts in a very simple fashion, a concept is activated once (with a cost depending on the item that activated it) and is left activated thereafter. As an example, a city also activates the nodes for the region and the country it is part of, so that cities from one country are chosen over cities from different countries. A test corpus of 12 German newspaper articles was tested regarding several disambiguation strategies. Disambiguation was carried out via a beam search to find an approximately cost-optimal solution for the conflict set of potential grounding candidates for the tagged spatial expression. Test showed that the disambiguation strategies improved accuracy significantly.
Published: 2004

43. Overview of the SPMRL 2013 shared task: cross-framework evaluation of parsing morphologically rich languages

Author: Seddah, Djamé, Tsarfaty, Reut, Kübler, Sandra, Candito, Marie, Choi, Jinho, Farkas, Richard, Foster, Jennifer, Goenaga, Iakes, Gojenola, Koldo, Goldberg, Yoav, Green, Spence, Habash, Nizar, Kuhlmann, Marco, Maier, Wolfgang, Nivre, Joakim, Przepiórkowski, Adam, Roth, Ryan, Seeker, Wolfgang, Versley, Yannick, Vincze, Veronika, Wolinski, Marcin, Wróblewska, Alina, Villemonte de la Clérgerie, Eric, Seddah, Djamé, Tsarfaty, Reut, Kübler, Sandra, Candito, Marie, Choi, Jinho, Farkas, Richard, Foster, Jennifer, Goenaga, Iakes, Gojenola, Koldo, Goldberg, Yoav, Green, Spence, Habash, Nizar, Kuhlmann, Marco, Maier, Wolfgang, Nivre, Joakim, Przepiórkowski, Adam, Roth, Ryan, Seeker, Wolfgang, Versley, Yannick, Vincze, Veronika, Wolinski, Marcin, Wróblewska, Alina, and Villemonte de la Clérgerie, Eric
Abstract: This paper reports on the first shared task on statistical parsing of morphologically rich languages (MRLs). The task features data sets from nine languages, each available both in constituency and dependency annotation. We report on the preparation of the data sets, on the proposed parsing scenarios, and on the evaluation metrics for parsing MRLs given different representation types. We present and analyze parsing results obtained by the task participants, and then provide an analysis and comparison of the parsers across languages and frameworks, reported for gold input as well as more realistic parsing scenarios.
Published: 2013

44. Vagueness and referential ambiguity in a large-scale annotated corpus

Author: Versley, Yannick and Versley, Yannick
Abstract: In this paper, we argue that difficulties in the definition of coreference itself contribute to lower inter-annotator agreement in certain cases. Data from a large referentially annotated corpus serves to corroborate this point, using a quantitative investigation to assess which effects or problems are likely to be the most prominent. Several examples where such problems occur are discussed in more detail, and we then propose a generalisation of Poesio, Reyle and Stevenson’s Justified Sloppiness Hypothesis to provide a unified model for these cases of disagreement and argue that a deeper understanding of the phenomena involved allows to tackle problematic cases in a more principled fashion than would be possible using only pre-theoretic intuitions.
Published: 2009

45. Tagging kausaler Relationen

Author: Versley, Yannick and Versley, Yannick
Abstract: In dieser Diplomarbeit geht es um kausale Beziehungen zwischen Ereignissen und Erklärungsbeziehungen zwischen Ereignissen, bei denen kausale Relationen eine wichtige Rolle spielen. Nachdem zeitliche Relationen einerseits ihrer einfacheren Formalisierbarkeit und andererseits ihrer gut sichtbaren Rolle in der Grammatik (Tempus und Aspekt, zeitliche Konjunktionen) wegen in jüngerer Zeit stärker im Mittelpunkt des Interesses standen, soll hier argumentiert werden, dass kausale Beziehungen und die Erklärungen, die sie ermöglichen, eine wichtigere Rolle im Kohärenzgefüge des Textes spielen. Im Gegensatz zu “tiefen” Verfahren, die auf einer detaillierten semantischen Repr¨asentation des Textes aufsetzen und infolgedessen für unrestringierten Text m. E. nicht geeignet sind, wird hier untersucht, wie man dieses Ziel erreichen kann, ohne sich auf eine aufwändig konstruierte Wissensbasis verlassen zu müssen., Causal relations between events and explanational relations among these events, where the causal relations play an important role, are the main topic of the present diploma thesis. After temporal relations between events have been more in the focus of interest recently because of both being easier to formalize and playing a visible role in grammar (notably the effects of time and aspect, as well as temporal conjunctions), I will argue that causal relations and the explanations they provide play the greater role in the coherence of a text. In contrast to “deep” approaches that rely on a fine-grained semantic representation of the text and by consequent can be unsuitable for unrestricted text, I will investigate how to reach this goal without requiring an expensive hand-coded knowledge base.
Published: 2008

46. Linguistic Tests for Discourse Relations

Author: Versley, Yannick, primary and Gastel, Anna, additional
Published: 2013
Full Text: View/download PDF

47. STTS als Part-of-Speech-Tagset in Tübinger Baumbanken

Author: Telljohann, Heike, primary, Versley, Yannick, additional, Beck, Kathrin, additional, Hinrichs, Erhard, additional, and Zastrow, Thomas, additional
Published: 2013
Full Text: View/download PDF

48. Scalable discriminative parsing for German

Author: Versley, Yannick, primary and Rehbein, Ines, additional
Published: 2009
Full Text: View/download PDF

49. BART

Author: Versley, Yannick, primary, Ponzetto, Simone Paolo, additional, Poesio, Massimo, additional, Eidelman, Vladimir, additional, Jern, Alan, additional, Smith, Jason, additional, Yang, Xiaofeng, additional, and Moschitti, Alessandro, additional
Published: 2008
Full Text: View/download PDF

50. Coreference systems based on kernels methods

Author: Versley, Yannick, primary, Moschitti, Alessandro, additional, Poesio, Massimo, additional, and Yang, Xiaofeng, additional
Published: 2008
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Database

Publisher

117 results on '"Versley, Yannick"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources