HamleDT: Harmonized multi-language dependency treebank.

Authors :
Zeman, Daniel
Dušek, Ondřej
Mareček, David
Popel, Martin
Ramasamy, Loganathan
Štěpánek, Jan
Žabokrtský, Zdeněk
Hajič, Jan
Source :
Language Resources & Evaluation. Dec 2014, Vol. 48 Issue 4, p601-637. 37p.
Publication Year :
2014

Abstract

We present HamleDT, a HArmonized Multi-LanguagE Dependency Treebank. HamleDT is a compilation of existing dependency treebanks (or dependency conversions of other treebanks), transformed so that they all conform to the same annotation style. In the present article, we provide a thorough investigation and discussion of a number of phenomena that are comparable across languages, though their annotation in treebanks often differs. We claim that transformation procedures can be designed to automatically identify most such phenomena and convert them to a unified annotation style. This unification is beneficial both to comparative corpus linguistics and to machine learning of syntactic parsing. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
1574-020X
Volume :
48
Issue :
4
Database :
Academic Search Index
Journal :
Language Resources & Evaluation
Publication Type :
Academic Journal
Accession number :
99708564
Full Text :
https://doi.org/10.1007/s10579-014-9275-2