Old Permic Universal Dependencies Treebank

Authors :
Niko Partanen
Jack Rueter
Rogier Blokland
Source :
Journal of Data Mining and Digital Humanities, Vol NLP4DH (2024)
Publication Year :
2024
Publisher :
Nicolas Turenne, 2024.

Abstract

Old Permic, also known as Old Komi, is an extinct variety of Komi that was spoken in the late Middle Ages in the lower Vychegda river basin in northeastern European Russia, in an area that is not currently Komi-speaking. This language variety is attested in fragmentary records from the 14th to 17th centuries, written both in the Old Permic alphabet and in Cyrillic. These records are of significant importance for research on the history of the Komi language. Here we introduce our work towards a new Universal Dependencies treebank that will eventually contain the existing corpus of Old Permic in a structured, CoNLL-U annotated format. This is the first time this material is being made openly available in digital form, and our contribution describes the current state of the art and the remaining challenges.

Details

Language :
English
ISSN :
2416-5999
Volume :
NLP4DH
Database :
Directory of Open Access Journals
Journal :
Journal of Data Mining and Digital Humanities
Publication Type :
Academic Journal
Accession number :
edsdoj.0a3f51d26b544d1994bf3631e33fef0d
Document Type :
article
Full Text :
https://doi.org/10.46298/jdmdh.13306