1. LongEval: Longitudinal Evaluation of Model Performance at CLEF 2023
- Author
- Rabab Alkhalifa, Iman Bilal, Hsuvas Borkakoty, Jose Camacho-Collados, Romain Deveaud, Alaa El-Ebshihy, Luis Espinosa-Anke, Gabriela Gonzalez-Saez, Petra Galuščáková, Lorraine Goeuriot, Elena Kochkina, Maria Liakata, Daniel Loureiro, Harish Tayyar Madabushi, Philippe Mulhem, Florina Piroi, Martin Popel, Christophe Servan, Arkaitz Zubiaga, Queen Mary University of London (QMUL), University of Dammam - Imam Abdulrahman Bin Faisal University, University of Warwick [Coventry], Cardiff University, QWANT enterprise, Modélisation et Recherche d’Information Multimédia [Grenoble] (MRIM), Laboratoire d'Informatique de Grenoble (LIG), Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes (UGA)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP), Université Grenoble Alpes (UGA)-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes (UGA)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP), Université Grenoble Alpes (UGA), The Alan Turing Institute, University of Bath [Bath], Fakultät für Mathematik und Geoinformation [Wien] (TU Wien), Vienna University of Technology (TU Wien), Charles University [Prague] (CU), Research Studios Austria, QWANT, Sciences et Technologies des Langues (STL), Laboratoire Interdisciplinaire des Sciences du Numérique (LISN), Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)-Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS), and ANR-19-CE23-0029, Kodicare, Variations de l'environnement d'évaluation : caractérisation du delta et impact sur l'évolution continue des systèmes de recherche d'information (2019)
- Subjects
Temporal Generalisability, [INFO.INFO-IR] Computer Science [cs]/Information Retrieval [cs.IR], Information Retrieval, Evaluation, Temporal Persistence, Text Classification
- Abstract
In this paper, we describe the plans for the first LongEval CLEF 2023 shared task, dedicated to evaluating the temporal persistence of Information Retrieval (IR) systems and text classifiers. The task is motivated by recent research showing that the performance of these models drops as the test data becomes more distant in time from the training data. LongEval differs from traditional shared IR and classification tasks by giving special consideration to evaluating models that aim to mitigate this performance drop over time. We envisage that the task will draw the attention of the IR community and NLP researchers to the problem of the temporal persistence of models: what enables or prevents it, potential solutions, and their limitations.
- Published
- 2023