
Word-to-word Machine Translation: Bilateral Similarity Retrieval for Mitigating Hubness

Authors :
Pu Haibo
Linchao He
Fei Han
Mengting Luo
Dejun Zhang
Mingyue Guo
Long Tian
Source :
IOP Conference Series: Materials Science and Engineering. 533:012051
Publication Year :
2019
Publisher :
IOP Publishing, 2019.

Abstract

Nearest neighbor search plays a critical role in machine word translation because it obtains the lingual labels of source word embeddings by searching for their k Nearest Neighbor (kNN) target embeddings in a shared bilingual semantic space. However, aligning two language distributions into a shared space usually requires large amounts of target labels, and kNN retrieval causes the hubness problem in high-dimensional feature spaces. Although most best-k retrieval methods remove hubs from the list of translation candidates to mitigate the hubness problem, eliminating hubs is flawed, because a hub is also the correct translation of some source word query and should not be crudely excluded. In this paper, we introduce an unsupervised machine word translation model based on Generative Adversarial Nets (GANs) with Bilingual Similarity retrieval, namely Unsupervised-BSMWT. Our model addresses three main challenges: (1) reducing the dependence on parallel data by using GANs in a fully unsupervised way; (2) significantly decreasing the training time of the adversarial game; and (3) proposing a novel Bilingual Similarity retrieval that mitigates hubness pollution regardless of whether a candidate is a hub. Our model efficiently achieves competitive results in 74 minutes, exceeding previous GAN-based models.
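
The kNN retrieval and hubness problem mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's Bilingual Similarity retrieval, whose details the abstract does not give; it only shows plain cosine-similarity kNN lookup and the standard k-occurrence count (how often a target appears in the kNN lists of the source words), under the assumption that source and target embeddings already share one space. The function names and the toy vectors are hypothetical.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def knn(query, targets, k):
    """Indices of the k target vectors most similar to the query."""
    ranked = sorted(range(len(targets)),
                    key=lambda j: cosine(query, targets[j]),
                    reverse=True)
    return ranked[:k]

def k_occurrence(sources, targets, k):
    """Count how often each target index appears in a source's kNN list.

    Targets with unusually high counts are the "hubs".
    """
    counts = Counter()
    for s in sources:
        counts.update(knn(s, targets, k))
    return counts

# Toy, made-up embeddings (assumed to be already aligned in one space).
sources = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.3], [0.0, 1.0]]
targets = [[1.0, 0.1], [0.0, 1.0], [-1.0, 0.0]]

counts = k_occurrence(sources, targets, k=1)
for j in range(len(targets)):
    print(f"target {j}: retrieved by {counts[j]} source word(s)")
```

In this toy data, target 0 is the nearest neighbor of three of the four source words, so it behaves like a hub; yet it is still the right answer for at least one of them, which is the abstract's argument against simply discarding hubs.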

Details

ISSN :
1757-899X and 1757-8981
Volume :
533
Database :
OpenAIRE
Journal :
IOP Conference Series: Materials Science and Engineering
Accession number :
edsair.doi...........36f89e7e2401d9d4d835fb516aa0bd10
Full Text :
https://doi.org/10.1088/1757-899x/533/1/012051