Evaluating language models for the retrieval and categorization of lexical collocations

Authors :
Joan Codina-Filbà
Leo Wanner
Luis Espinosa Anke
Source :
EACL
Publication Year :
2021
Publisher :
ACL (Association for Computational Linguistics), 2021.

Abstract

Paper presented at: EACL 2021, held online from 19 to 23 April 2021. Lexical collocations are idiosyncratic combinations of two syntactically bound lexical items (e.g., “heavy rain”, “take a step” or “undergo surgery”). Understanding their degree of compositionality and idiosyncrasy, as well as their underlying semantics, is crucial for language learners, lexicographers and downstream NLP applications alike. In this paper we analyse a suite of language models for collocation understanding. We first construct a dataset of occurrences of lexical collocations in context, categorized into 16 representative semantic categories. Then, we perform two experiments: (1) unsupervised collocate retrieval, and (2) supervised collocation classification in context. We find that most models perform well in distinguishing light verb constructions, especially if the collocation’s first argument acts as a subject, but often fail to distinguish, first, different syntactic structures within the same semantic category, and second, finer-grained categories which restrict the set of correct collocates. This work was partially supported by the European Commission via its H2020 Program under the contract number 870930.

Details

Language :
English
Database :
OpenAIRE
Journal :
EACL
Accession number :
edsair.doi.dedup.....8c0cdb3cee6414a9097703d72ab964e2