Neural machine translation with a polysynthetic low resource language
- Source :
- Machine Translation. 34:325-346
- Publication Year :
- 2020
- Publisher :
- Springer Science and Business Media LLC, 2020.
Abstract
- Low-resource languages (LRLs) with complex morphology are known to be more difficult to translate automatically. Some LRLs are more difficult to translate than others because of a lack of research interest or collaboration. In this article, we experiment with a specific LRL, Quechua, which is spoken by millions of people in South America yet has not been the subject of a neural translation approach until now. We improve on the latest published results with baseline BLEU scores obtained using state-of-the-art recurrent neural network approaches to translation. Additionally, we experiment with several morphological segmentation techniques and introduce a new one to decompose the language’s suffix-based morphemes. We extend our work to high-resource languages (HRLs) such as Finnish and Spanish to show that Quechua, for qualitative purposes, can be considered compatible with and translatable into major European languages, with measurements comparable to those of state-of-the-art HRL systems at this time. We conclude by making our two best Quechua–Spanish translation engines available online.
- Subjects :
- Linguistics and Language
Machine translation
Low resource
Computer science
business.industry
02 engineering and technology
computer.software_genre
Language and Linguistics
Recurrent neural network
Artificial Intelligence
Morpheme
0202 electrical engineering, electronic engineering, information engineering
020201 artificial intelligence & image processing
Artificial intelligence
Suffix
Computational linguistics
business
Baseline (configuration management)
computer
Morphological segmentation
Software
Natural language processing
Details
- ISSN :
- 1573-0573 and 0922-6567
- Volume :
- 34
- Database :
- OpenAIRE
- Journal :
- Machine Translation
- Accession number :
- edsair.doi...........0b6b818909047afd0545b10cf191430e
- Full Text :
- https://doi.org/10.1007/s10590-020-09255-9