Improving bilingual word embeddings mapping with monolingual context information

Authors :
Fuhua Zhang
Tianqi Li
Chenggang Mi
Zhifeng Zhang
Shaolin Zhu
Yu Sun
Source :
Machine Translation. 35:503-518
Publication Year :
2021
Publisher :
Springer Science and Business Media LLC, 2021.

Abstract

Bilingual word embeddings (BWEs) play an important role in many natural language processing (NLP) tasks, especially cross-lingual tasks such as machine translation (MT) and cross-language information retrieval. Most existing methods for training BWEs rely on bilingual supervision, but bilingual resources are unavailable for many low-resource language pairs. Although some studies have addressed this issue with unsupervised methods, they do not use monolingual contextual data to improve the performance of low-resource BWEs. To address these issues, we propose an unsupervised method that improves BWEs using optimized monolingual context information, without any parallel corpora. In particular, we first build a bilingual word embedding mapping model between two languages by aligning their monolingual word embedding spaces through unsupervised adversarial training. To further improve these mappings, we use monolingual context information to optimize them during training. Experimental results show that our method significantly outperforms the baseline systems, including on four low-resource language pairs.
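
As a rough illustration of what a mapping between two monolingual embedding spaces does once it exists, the sketch below applies a linear map to source-language vectors and retrieves the nearest target-language word by cosine similarity. It is not the authors' method: the words, the 2-D vectors, and the matrix `W` are made-up assumptions, and `W` is set by hand here, whereas the abstract describes learning the mapping through adversarial training and refining it with monolingual context.

```python
import math

def matvec(W, x):
    """Apply a linear mapping W (a list of rows) to vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def translate(word, src_emb, tgt_emb, W):
    """Map a source word into the target space; return the nearest target word."""
    mapped = matvec(W, src_emb[word])
    return max(tgt_emb, key=lambda t: cosine(mapped, tgt_emb[t]))

# Toy, made-up 2-D embeddings (assumption): the target space is the
# source space rotated by 90 degrees.
src_emb = {"dog": [1.0, 0.0], "cat": [0.0, 1.0]}
tgt_emb = {"perro": [0.0, 1.0], "gato": [-1.0, 0.0]}

# Hand-set 90-degree rotation (assumption); an unsupervised system
# would have to learn this matrix without a bilingual dictionary.
W = [[0.0, -1.0],
     [1.0, 0.0]]

translations = {w: translate(w, src_emb, tgt_emb, W) for w in src_emb}
print(translations)
```

Nearest-neighbour retrieval over mapped vectors is a common way to induce a bilingual lexicon from such a mapping; the record does not state which retrieval criterion the paper uses.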

Details

ISSN :
1573-0573 and 0922-6567
Volume :
35
Database :
OpenAIRE
Journal :
Machine Translation
Accession number :
edsair.doi...........1ebef003306df88be09996cfb8d9497a
Full Text :
https://doi.org/10.1007/s10590-021-09274-0