Introducing RONEC -- the Romanian Named Entity Corpus
- Publication Year :
- 2019
Abstract
- We present RONEC - the Named Entity Corpus for the Romanian language. The corpus contains over 26000 entities in ~5000 annotated sentences, belonging to 16 distinct classes. The sentences have been extracted from a copyright-free newspaper, covering several styles. This corpus represents the first initiative in the Romanian language space specifically targeted at named entity recognition. It is available in BRAT and CoNLL-U Plus formats, and it is free to use and extend at github.com/dumitrescustefan/ronec.
- Comment: 8 pages + annex, accepted to LREC2020 in the main conference
- Subjects :
- Computer Science - Computation and Language
Details
- Database :
- arXiv
- Publication Type :
- Report
- Accession number :
- edsarx.1909.01247
- Document Type :
- Working Paper