1. Romanian Lexical Resources Interconnection.
- Author
Scutelnicu, Liviu Andrei
- Subjects
ROMANIAN language, ROMANIANS, ELECTRONIC dictionaries
- Abstract
Great efforts are being made to increase the utility of linguistic resources in language processing applications by interconnecting them. This study focuses on Romanian, a language that is still young in this domain and for which big steps must be taken before it can be considered well resourced, both qualitatively and quantitatively. The CoRoLa corpus was developed over 4 years and has been available for research since 2017. It contains about one billion words, annotated (tokenized, lemmatized, morphologically and syntactically processed). The two other resources used in this study are eDTLR (the electronic version of the Thesaurus Dictionary of the Romanian Language) and the Romanian WordNet. The study describes a technology that unifies these resources and brings them to a common standard, paving the way for an efficient coupling of these resources. With the newly obtained standardized resource, a series of tests and case studies were performed, with the benefit of a much more diverse exposure to Romanian language use. [ABSTRACT FROM AUTHOR]
- Published
2021