MasakhaNER: Named entity recognition for African languages

Authors :
Julia Kreutzer
Ayodele Awokoya
Ignatius Ezeani
Rubungo Andre Niyongabo
Happy Buzaaba
Adewale Akinfaderin
Samuel Oyerinde
Stephen Mayhew
Emmanuel Anebi
Mofetoluwa Adeyemi
Kelechi Ogueji
Abdoulaye Diallo
Seid Muhie Yimam
Jade Abbott
Joyce Nakatumba-Nabende
Victor Akinode
Blessing Sibanda
Catherine Gitau
Chester Palen-Michel
Shamsuddeen Hassan Muhammad
Degaga Wolde
Graham Neubig
Tendai Marengereke
Paul Rayson
Derguene Mbaye
Eric Peter Wairagala
Daniel D'souza
Tosin P. Adewumi
Jonathan Mukiibi
Chris Chinenye Emezue
David Ifeoluwa Adelani
Shruti Rijhwani
Iroro Orife
Verrah Otiende
Maurice Katusiime
Yvonne Wambui
Dibora Gebreyohannes
Kelechi Nwaike
Salomey Osei
Chiamaka Chukwuneke
Henok Tilaye
Deborah Nabagereka
Thierno Ibrahima Diop
Orevaoghene Ahia
Jesujoba O. Alabi
Sebastian Ruder
Davis David
Mouhamadane Mboup
Samba Ngom
Tajuddeen R. Gwadabe
Bonaventure F. P. Dossou
Temilola Oloyede
Perez Ogayo
Clemencia Siro
Gerald Muriuki
Aremu Anuoluwapo
Nkiruka Odu
Tobius Saul Bateesa
Abdoulaye Faye
Israel Abebe Azime
Constantine Lignos
Saarland University [Saarbrücken]
Masakhane NLP
Retro Rabbit
Carnegie Mellon University [Pittsburgh] (CMU)
ProQuest
Google Research
Brandeis University
University of Tsukuba
DeepMind
DeepMind Technologies
Duolingo
African Institute for Mathematical Sciences (AIMS)
University of Porto
Bayero University Kano (BUK)
Technical University of Munich [Munich, Germany] (TUM)
Makerere University [Kampala, Uganda] (MAK)
African Leadership University
University of Lagos
Max Planck Institute for Informatics [Saarbrücken]
Universität Hamburg (UHH)
University of Chinese Academy of Sciences [Beijing] (UCAS)
Lancaster University
University of Electronic Science and Technology of China (UESTC)
United States International University - Africa
Niger-Volta Language Technologies Institute
Luleå University of Technology (LUT)
African University of Science and Technology (AUST)
University of Ibadan
Namibia University of Science and Technology (NUST)
InstaDeep
Jacobs University [Bremen]
University of Waterloo [Waterloo]
European Project: 825081,H2020,COMPRISE(2018)
Source :
Transactions of the Association for Computational Linguistics, The MIT Press, 2021, ⟨10.1162/tacl⟩
Publication Year :
2021
Publisher :
HAL CCSD, 2021.

Abstract

We take a step towards addressing the under-representation of the African continent in NLP research by creating the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages, bringing together a variety of stakeholders. We detail characteristics of the languages to help researchers understand the challenges that these languages pose for NER. We analyze our datasets and conduct an extensive empirical evaluation of state-of-the-art methods across both supervised and transfer learning settings. We release the data, code, and models in order to inspire future research on African NLP.

Comment: Accepted to TACL 2021, pre-MIT Press publication version

Details

Language :
English
ISSN :
2307-387X
Database :
OpenAIRE
Journal :
Transactions of the Association for Computational Linguistics, The MIT Press, 2021, ⟨10.1162/tacl⟩
Accession number :
edsair.doi.dedup.....d1930ae0735e2b37961957cb5eb49a8e
Full Text :
https://doi.org/10.1162/tacl