MasakhaNER: Named entity recognition for African languages
- Source :
- Transactions of the Association for Computational Linguistics, The MIT Press, 2021, ⟨10.1162/tacl⟩
- Publication Year :
- 2021
- Publisher :
- HAL CCSD, 2021.
Abstract
- We take a step towards addressing the under-representation of the African continent in NLP research by creating the first large publicly available high-quality dataset for named entity recognition (NER) in ten African languages, bringing together a variety of stakeholders. We detail characteristics of the languages to help researchers understand the challenges that these languages pose for NER. We analyze our datasets and conduct an extensive empirical evaluation of state-of-the-art methods across both supervised and transfer learning settings. We release the data, code, and models in order to inspire future research on African NLP.
- Comment: Accepted to TACL 2021, pre-MIT Press publication version
- Subjects :
- FOS: Computer and information sciences
Linguistics and Language
Computer Science - Computation and Language
Computer science
Computer Science - Artificial Intelligence
Communication
Languages of Africa
Code (semiotics)
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]
Computer Science Applications
Human-Computer Interaction
Artificial Intelligence (cs.AI)
Named-entity recognition
Artificial Intelligence
Artificial intelligence
Transfer of learning
Computation and Language (cs.CL)
Natural language processing
Details
- Language :
- English
- ISSN :
- 2307-387X
- Database :
- OpenAIRE
- Journal :
- Transactions of the Association for Computational Linguistics, The MIT Press, 2021, ⟨10.1162/tacl⟩
- Accession number :
- edsair.doi.dedup.....d1930ae0735e2b37961957cb5eb49a8e
- Full Text :
- https://doi.org/10.1162/tacl