13 results for "KELLEHER, JOHN"
Search Results
2. English WordNet Taxonomic Random Walk Pseudo-Corpora
- Author
-
Klubicka, Filip, Maldonado, Alfredo, Mahalunkar, Abhijit, Kelleher, John D., ADAPT Centre for Digital Content Technology, and SFI Research Centres Programme
- Subjects
random walk, Computational Linguistics, taxonomy, ComputingMethodologies_PATTERNRECOGNITION, WordNet, InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL, language resource, ComputingMethodologies_ARTIFICIALINTELLIGENCE, pseudo-corpus, semantic relationship
- Abstract
This is a resource description paper that describes the creation and properties of a set of pseudo-corpora generated artificially from a random walk over the English WordNet taxonomy. Our WordNet taxonomic random walk implementation allows the exploration of different random walk hyperparameters and the generation of a variety of different pseudo-corpora. We find that different combinations of the walk’s hyperparameters result in varying statistical properties of the generated pseudo-corpora. We have published a total of 81 pseudo-corpora that we have used in our previous research, but have not exhausted all possible combinations of hyperparameters, which is why we have also published a codebase that allows the generation of additional WordNet taxonomic pseudo-corpora as needed. Ultimately, such pseudo-corpora can be used to train taxonomic word embeddings, as a way of transferring taxonomic knowledge into a word embedding space.
- Published
- 2020
- Full Text
- View/download PDF
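The abstract above describes generating pseudo-corpora via random walks over the English WordNet taxonomy. As a rough illustration of the idea only (not the authors' published implementation or configuration), the sketch below walks hypernym/hyponym edges using NLTK's WordNet interface; the hyperparameter names num_walks and walk_length are assumptions made for this example.

```python
# Illustrative sketch of a taxonomic random walk over English WordNet (NLTK).
# Requires the WordNet data: nltk.download('wordnet')
import random
from nltk.corpus import wordnet as wn

def taxonomic_random_walk(num_walks=1000, walk_length=10, seed=42):
    """Generate pseudo-sentences by walking hypernym/hyponym edges."""
    random.seed(seed)
    noun_synsets = list(wn.all_synsets(pos='n'))
    corpus = []
    for _ in range(num_walks):
        node = random.choice(noun_synsets)
        sentence = [node.lemma_names()[0]]
        for _ in range(walk_length - 1):
            neighbours = node.hypernyms() + node.hyponyms()
            if not neighbours:      # no taxonomic edge to follow: stop this walk early
                break
            node = random.choice(neighbours)
            sentence.append(node.lemma_names()[0])
        corpus.append(" ".join(sentence))
    return corpus

if __name__ == "__main__":
    for line in taxonomic_random_walk(num_walks=3, walk_length=6):
        print(line)
```

Varying such hyperparameters (walk length, number of walks, restart policy) is what the paper reports as producing pseudo-corpora with different statistical properties.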
3. Synthetic, Yet Natural: Properties of WordNet Random Walk Corpora and the impact of rare words on embedding performance
- Author
-
Klubicka, Filip, Maldonado, Alfredo, Mahalunkar, Abhijit, Kelleher, John D., ADAPT Centre for Digital Content Technology, and SFI Research Centres Programme
- Subjects
word embeddings, evaluation, Artificial Intelligence and Robotics, WordNet, InformationSystems_INFORMATIONSTORAGEANDRETRIEVAL, representations, Software Engineering, corpus, ComputingMethodologies_ARTIFICIALINTELLIGENCE, random walk, Computational Linguistics, taxonomy, word similarity, Numerical Analysis and Scientific Computing
- Abstract
Creating word embeddings that reflect semantic relationships encoded in lexical knowledge resources is an open challenge. One approach is to use a random walk over a knowledge graph to generate a pseudo-corpus and use this corpus to train embeddings. However, the effect of the shape of the knowledge graph on the generated pseudo-corpora, and on the resulting word embeddings, has not been studied. To explore this, we use English WordNet, constrained to the taxonomic (tree-like) portion of the graph, as a case study. We investigate the properties of the generated pseudo-corpora, and their impact on the resulting embeddings. We find that the distributions in the pseudo-corpora exhibit properties found in natural corpora, such as Zipf’s and Heaps’ law, and also observe that the proportion of rare words in a pseudo-corpus affects the performance of its embeddings on word similarity.
- Published
- 2019
- Full Text
- View/download PDF
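The abstract above checks the generated pseudo-corpora for Zipf's and Heaps' law and relates the proportion of rare words to embedding quality. A minimal sketch of how such properties could be inspected is shown below; the rarity threshold and the particular statistics reported are assumptions made for this example, not the paper's analysis.

```python
# Illustrative sketch: Zipfian rank-frequency behaviour and rare-word share of a
# (pseudo-)corpus given as a list of whitespace-tokenised sentences.
from collections import Counter

def corpus_statistics(sentences, rare_threshold=5):
    tokens = [tok for sent in sentences for tok in sent.split()]
    freqs = Counter(tokens)
    ranked = freqs.most_common()

    # Under Zipf's law, rank * frequency is roughly constant across ranks.
    zipf_products = [(rank + 1) * count for rank, (_, count) in enumerate(ranked)]

    rare_types = sum(1 for count in freqs.values() if count < rare_threshold)
    return {
        "num_tokens": len(tokens),
        "num_types": len(freqs),
        "rare_type_proportion": rare_types / len(freqs),
        "zipf_head": zipf_products[:10],  # eyeball: values should be of similar magnitude
    }

# Usage, e.g. with pseudo-sentences produced by a walk like the one sketched
# under result 2 above:
# print(corpus_statistics(taxonomic_random_walk(num_walks=5000, walk_length=10)))
```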
4. Back to the Future: Logic and Machine Learning
- Author
-
Dobnik, Simon, Kelleher, John D., ADAPT Research Centre, and SFI Research Centres Programme
- Subjects
computational linguistic, Computational Linguistics, logic, dialogue, formal approaches, machine learning, structure learning, Computer Sciences, deep learning, language technology, data driven approaches, spatial language
- Abstract
In this paper we argue that since the beginning of natural language processing and computational linguistics there has been a strong connection between logic and machine learning. First of all, there is something logical about language and something linguistic about logic. Secondly, we argue that rather than distinguishing between logic and machine learning, a more useful distinction is between top-down approaches and data-driven approaches. Examining some recent approaches in deep learning, we argue that they incorporate both properties, and that this is the reason for their very successful adoption for solving several problems within language technology.
- Published
- 2017
5. Towards a Computational Model of Frame of Reference Alignment in Swedish Dialogue
- Author
-
Dobnik, Simon, Howes, Christine, Demaret, Kim, and Kelleher, John D.
- Subjects
Computational Linguistics, Psycholinguistics and Neurolinguistics, Semantics and Pragmatics, Swedish Dialogue, Computer Sciences, Frame of Reference, Language Description and Documentation, Dialogue, Applied Linguistics, Dialogue Systems, Interpersonal and Small Group Communication, Spatial Language, Discourse and Text Linguistics
- Abstract
In this paper we examine how people negotiate, interpret and repair the frame of reference (FoR) in online text-based dialogues discussing spatial scenes in Swedish. We describe work in progress in which participants are given different perspectives of the same scene and asked to locate several objects that are shown in only one of their pictures. This task requires participants to coordinate on FoR in order to identify the missing objects. This study has implications for situated dialogue systems.
- Published
- 2016
- Full Text
- View/download PDF
6. Perception Based Misunderstandings in Human-Computer Dialogues
- Author
-
Schütte, Niels, Kelleher, John D., and Mac Namee, Brian
- Subjects
Situated Dialogue, Human-Computer Interaction, Computational Linguistics, Computer and Systems Architecture, InformationSystems_MODELSANDPRINCIPLES, Perception, Dialogue, Robotics, Dialogue Systems, Sensor Errors, Misunderstandings, Discourse and Text Linguistics
- Abstract
In a situated dialogue, misunderstandings may arise if the participants perceive or interpret the environment in different ways. In human-computer dialogue this may be due to sensor errors. We present an experimental system and a series of experiments in which we investigate this problem.
- Published
- 2014
7. Proceedings of the Sixth International Natural Language Generation Conference (INLG 2010)
- Author
-
Kelleher, John D., Mac Namee, Brian, and van der Sluis, Ielka
- Subjects
Computational Linguistics, Natural Language Generation, Artificial Intelligence and Robotics, Cognition and Perception, Psycholinguistics and Neurolinguistics, Semantics and Pragmatics, Syntax, Discourse and Text Linguistics
- Published
- 2010
- Full Text
- View/download PDF
8. Referring Expression Generation Challenge 2008 DIT System Descriptions (DIT-FBI, DIT-TVAS, DIT-CBSR, DIT-RBR, DIT-FBI-CBSR, DIT-TVAS-RBR)
- Author
-
Kelleher, John D. and Mac Namee, Brian
- Subjects
Referring Expression Generation, Computational Linguistics, Natural Language Generation, Artificial Intelligence and Robotics
- Abstract
This paper describes a set of systems developed at DIT for the Referring Expression Generation challenge at INLG 2008. In Proceedings of the 5th International Natural Language Generation Conference (INLG-08).
- Published
- 2008
- Full Text
- View/download PDF
9. Frequency Based Incremental Attribute Selection for GRE
- Author
-
Kelleher, John D.
- Subjects
Referring Expression Generation, Computational Linguistics, Natural Language Generation, Artificial Intelligence and Robotics
- Abstract
The DIT system uses an incremental greedy search to generate descriptions, similar to the incremental algorithm described in (Dale and Reiter, 1995). The selection of the next attribute to be tested for inclusion in the description is ordered by the absolute frequency of each attribute in the training corpus. Attributes are selected in descending order of frequency (i.e. the attribute that occurred most frequently in the training corpus is selected first). Where two or more attributes have the same frequency of occurrence, the first attribute found with that frequency is selected. The type attribute is always included in the description. Other attributes are included in the description if they exclude at least one distractor from the set of distractors that fulfil the description generated prior to that attribute’s selection. The algorithm terminates when a distinguishing description has been generated (i.e., all the distractors have been excluded) or when all the target’s attributes have been tested for inclusion in the description.
- Published
- 2007
- Full Text
- View/download PDF
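The abstract above spells out a frequency-ordered incremental attribute selection procedure. The following sketch shows one way such a procedure could look in the spirit of Dale and Reiter (1995); the attribute names, data layout and frequency counts are invented here purely for illustration and are not taken from the DIT system.

```python
# Illustrative sketch of frequency-ordered incremental attribute selection for
# referring expression generation. Domain objects are plain dicts of attributes.
from collections import Counter

def generate_description(target, distractors, attribute_frequencies):
    """Select attributes for a referring expression, most frequent first."""
    # Order attributes by descending corpus frequency; ties keep first-found order.
    ordered = [attr for attr, _ in attribute_frequencies.most_common()]
    description = {"type": target["type"]}   # the type attribute is always included
    remaining = [d for d in distractors if d.get("type") == target["type"]]

    for attr in ordered:
        if attr == "type" or attr not in target:
            continue
        # Include the attribute only if it rules out at least one remaining distractor.
        ruled_out = [d for d in remaining if d.get(attr) != target[attr]]
        if ruled_out:
            description[attr] = target[attr]
            remaining = [d for d in remaining if d.get(attr) == target[attr]]
        if not remaining:                     # distinguishing description generated
            break
    return description

# Usage (hypothetical domain and counts):
target = {"type": "chair", "colour": "red", "size": "large"}
distractors = [{"type": "chair", "colour": "blue", "size": "large"},
               {"type": "table", "colour": "red", "size": "small"}]
freqs = Counter({"type": 200, "colour": 120, "size": 80})
print(generate_description(target, distractors, freqs))   # {'type': 'chair', 'colour': 'red'}
```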
10. Proceedings of the 4th ACL-SIGSEM Workshop on Prepositions at ACL-2007
- Author
-
Costello, Fintan, Kelleher, John D., and Volk, Martin
- Subjects
Computational Linguistics, surgical procedures, operative, Artificial Intelligence and Robotics, Psycholinguistics and Neurolinguistics, Semantics and Pragmatics, musculoskeletal, neural, and ocular physiology, education, Computation, Prepositions, musculoskeletal system, human activities, Semantics, Discourse and Text Linguistics
- Abstract
This volume contains the papers presented at the Fourth ACL-SIGSEM Workshop on Prepositions. This workshop is endorsed by the ACL Special Interest Group on Semantics (ACL-SIGSEM), and is hosted in conjunction with ACL 2007, taking place on 28th June, 2007 in Prague, the Czech Republic.
- Published
- 2007
11. A perceptually based computational framework for the interpretation of spatial language
- Author
-
Kelleher, John D., van Genabith, Josef, Costello, Fintan, and Humphreys, Mark
- Subjects
Computational linguistics, Virtual reality, Computer simulation, Natural language processing
- Abstract
The goal of this work is to develop a semantic framework to underpin the development of natural language (NL) interfaces for 3-dimensional (3-D) simulated environments. The thesis of this work is that the computational interpretation of language in such environments should be based on a framework that integrates a model of visual perception with a model of discourse. When interacting with a 3-D environment, users have two main goals: the first is to move around in the simulated environment and the second is to manipulate objects in the environment. In order to interact with an object through language, users need to be able to refer to the object. There are many different types of referring expressions, including definite descriptions, pronominals, demonstratives, one-anaphora, other-expressions, and locative expressions. Some of these expressions are anaphoric (e.g., pronominals, one-anaphora, other-expressions). In order to computationally interpret these, it is necessary to develop, and implement, a discourse model. Interpreting locative expressions requires a semantic model for prepositions and a mechanism for selecting the user’s intended frame of reference. Finally, many of these expressions presuppose a visual context; in order to interpret them this context must be modelled and utilised. This thesis develops a perceptually grounded, discourse-based computational model of reference resolution capable of handling anaphoric and locative expressions. There are three novel contributions in this framework: a visual saliency algorithm, a semantic model for locative expressions containing projective prepositions, and a discourse model. The visual saliency algorithm grades the prominence of the objects in the user's view volume at each frame. This algorithm is based on the assumption that objects which are larger and more central to the user's view are more prominent than objects which are smaller or on the periphery of their view. The resulting saliency ratings for each frame are stored in a data structure linked to the NL system’s context model. This approach gives the system a visual memory that may be drawn upon in order to resolve references. The semantic model for locative expressions defines a computational algorithm for interpreting locatives that contain a projective preposition, specifically the prepositions in front of, behind, to the right of, and to the left of. There are several novel components within this model. First, there is a procedure for handling the issue of frame of reference selection. Second, there is an algorithm for modelling the spatial templates of projective prepositions. This algorithm integrates a topological model with visual perceptual cues. This approach allows us to correctly define the regions described by projective prepositions in the viewer-centred frame of reference, in situations that previous models (Yamada 1993, Gapp 1994a, Olivier et al. 1994, Fuhr et al. 1998) have found problematic. Thirdly, the abstraction used to represent the candidate trajectors of a locative expression ensures that each candidate is ascribed the highest rating possible. This approach guarantees that the candidate trajector that occupies the location with the highest applicability in the preposition's spatial template is selected as the locative’s referent. The context model extends the work of Salmon-Alt and Romary (2001) by integrating the perceptual information created by the visual saliency algorithm with a model of discourse.
Moreover, the context model defines an interpretation process that provides an explicit account of how the visual and linguistic information sources are utilised when attributing a referent to a nominal expression. It is important to note that the context model provides the set of candidate referents and candidate trajectors for the locative expression interpretation algorithm. These are restricted to those objects that the user has seen. The thesis shows that visual salience provides a qualitative control in NL interpretation for 3-D simulated environments and captures interesting and significant effects such as graded judgments. Moreover, it provides an account of how object occlusion impacts on the semantics of projective prepositions that are canonically aligned with the front-back axis in the viewer-centred frame of reference.
- Published
- 2003
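The thesis abstract above describes a visual saliency algorithm that grades objects by how large and how central they are in the user's view. As a rough sketch of that idea only (the weighting scheme, coordinate conventions and function names below are assumptions, not the thesis implementation), such a per-frame grading could look like this:

```python
# Illustrative sketch: grade objects in the current frame by projected size and
# centrality in the view, combining the two cues with assumed equal weights.
import math

def salience(objects, view_width, view_height, w_size=0.5, w_centre=0.5):
    """Return {object_id: score in [0, 1]} for objects visible in the frame.

    Each object is (object_id, cx, cy, area): screen-space centre coordinates
    and projected area in pixels.
    """
    frame_area = view_width * view_height
    centre = (view_width / 2.0, view_height / 2.0)
    max_dist = math.hypot(*centre)            # distance from centre to a corner
    scores = {}
    for obj_id, cx, cy, area in objects:
        size_score = min(area / frame_area, 1.0)
        dist = math.hypot(cx - centre[0], cy - centre[1])
        centre_score = 1.0 - dist / max_dist
        scores[obj_id] = w_size * size_score + w_centre * centre_score
    return scores

# Usage: per-frame scores could be stored in the context model so that reference
# resolution can draw on a "visual memory" of recently salient objects.
print(salience([("ball", 400, 300, 9000), ("box", 60, 40, 20000)], 800, 600))
```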
12. Twenty Years of Confusion in Human Evaluation: NLG Needs Evaluation Sheets and Standardised Definitions
- Author
-
Howcroft, David, Belz, Anya, Gkatzia, Dimitra, Clinciu, Miruna, Hasan, Sadid, Mahamood, Saad, Mille, Simon, van Miltenburg, Emiel, Santhanam, Sashank, Rieser, Verena, Language, Communication and Cognition, Davis, Brian, Graham, Yvette, Kelleher, John D., and Sripada, Yaji
- Subjects
Computational linguistics
- Abstract
Human assessment remains the most trusted form of evaluation in NLG, but highly diverse approaches and a proliferation of different quality criteria used by researchers make it difficult to compare results and draw conclusions across papers, with adverse implications for meta-evaluation and reproducibility. In this paper, we present (i) our dataset of 165 NLG papers with human evaluations, (ii) the annotation scheme we developed to label the papers for different aspects of evaluations, (iii) quantitative analyses of the annotations, and (iv) a set of recommendations for improving standards in evaluation reporting. We use the annotations as a basis for examining information included in evaluation reports, and levels of consistency in approaches, experimental design and terminology, focusing in particular on the 200+ different terms that have been used for evaluated aspects of quality. We conclude that due to a pervasive lack of clarity in reports and extreme diversity in approaches, human evaluation in NLG presents as extremely confused in 2020, and that the field is in urgent need of standard methods and terminology.
- Published
- 2020
13. Disentangling the properties of human evaluation methods: a classification system to support comparability, meta-evaluation and reproducibility testing
- Author
-
Belz, Anya, Mille, Simon, Howcroft, David M., Davis, Brian, Graham, Yvette, and Kelleher, John D.
- Subjects
Computational linguistics
- Abstract
Current standards for designing and reporting human evaluations in NLP mean it is generally unclear which evaluations are comparable and can be expected to yield similar results when applied to the same system outputs. This has serious implications for reproducibility testing and meta-evaluation, in particular given that human evaluation is considered the gold standard against which the trustworthiness of automatic metrics is gauged. Using examples from NLG, we propose a classification system for evaluations based on disentangling (i) what is being evaluated (which aspect of quality), and (ii) how it is evaluated in specific (a) evaluation modes and (b) experimental designs. We show that this approach provides a basis for determining comparability, hence for comparison of evaluations across papers, meta-evaluation experiments, and reproducibility testing.
- Published
- 2020