Improving bilingual word embeddings mapping with monolingual context information
- Source :
- Machine Translation. 35:503-518
- Publication Year :
- 2021
- Publisher :
- Springer Science and Business Media LLC, 2021.
- Abstract :
- Bilingual word embeddings (BWEs) play an important role in many natural language processing (NLP) tasks, especially cross-lingual tasks such as machine translation (MT) and cross-language information retrieval. Most existing methods for training BWEs rely on bilingual supervision, but bilingual resources are unavailable for many low-resource language pairs. Although some studies have addressed this issue with unsupervised methods, they do not use monolingual contextual data to improve the performance of low-resource BWEs. To address these issues, we propose an unsupervised method that improves BWEs using optimized monolingual context information, without any parallel corpora. In particular, we first build a bilingual word embedding mapping model between two languages by aligning their monolingual word embedding spaces through unsupervised adversarial training. To further improve the quality of these mappings, we use monolingual context information to optimize them during training. Experimental results show that our method significantly outperforms the baseline systems, including on four low-resource language pairs.
- Subjects :
- Linguistics and Language
- Language and Linguistics
- Word embedding
- Computer science
- Context (language use)
- Parallel corpora
- Contextual design
- Artificial Intelligence
- Computational linguistics
- Natural language processing
- Software
- Word (computer architecture)
- Information Systems: Information Storage and Retrieval
- Computing Methodologies: Pattern Recognition
Details
- ISSN :
- 1573-0573 and 0922-6567
- Volume :
- 35
- Database :
- OpenAIRE
- Journal :
- Machine Translation
- Accession number :
- edsair.doi...........1ebef003306df88be09996cfb8d9497a
- Full Text :
- https://doi.org/10.1007/s10590-021-09274-0