Word-to-word Machine Translation: Bilateral Similarity Retrieval for Mitigating Hubness
- Source :
- IOP Conference Series: Materials Science and Engineering. 533:012051
- Publication Year :
- 2019
- Publisher :
- IOP Publishing, 2019.
Abstract
- Nearest neighbor search plays a critical role in machine word translation because it obtains the lingual labels of source word embeddings by searching for their k nearest neighbor (kNN) target embeddings in a shared bilingual semantic space. However, aligning two language distributions into a shared space usually requires large amounts of target labels, and kNN retrieval causes the hubness problem in high-dimensional feature spaces. Most best-k retrieval methods mitigate hubness by removing hubs from the list of translation candidates, but eliminating hubs is flawed: a hub also has a correct source word query corresponding to it and should not be crudely excluded. In this paper, we introduce an unsupervised machine word translation model based on Generative Adversarial Nets (GANs) with Bilingual Similarity retrieval, namely Unsupervised-BSMWT. Our model addresses three main challenges: (1) reducing the dependence on parallel data by using GANs in a fully unsupervised way; (2) significantly decreasing the training time of the adversarial game; and (3) proposing a novel Bilingual Similarity retrieval that mitigates hubness pollution regardless of whether a candidate is a hub. Our model efficiently achieves competitive results in 74 min, exceeding previous GAN-based models.
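The kNN retrieval and hubness effect that the abstract describes can be illustrated with a minimal sketch. It is not the paper's Bilingual Similarity retrieval or its GAN-based alignment: it only shows plain cosine-similarity kNN over two sets of synthetic random vectors, and counts how often each target is retrieved (its k-occurrence). The function names `normalize`, `knn_retrieve`, and `k_occurrence`, the data sizes, and the use of random Gaussian vectors in place of real word embeddings are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def normalize(v):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def knn_retrieve(src, tgt, k):
    """For each source vector, return indices of its k most similar targets."""
    tgt_unit = [normalize(t) for t in tgt]
    result = []
    for s in src:
        s_unit = normalize(s)
        sims = [sum(a * b for a, b in zip(s_unit, t)) for t in tgt_unit]
        # Rank target indices by cosine similarity, best match first.
        ranked = sorted(range(len(tgt_unit)), key=lambda j: sims[j], reverse=True)
        result.append(ranked[:k])
    return result


def k_occurrence(neighbors, n_tgt):
    """Count how many source queries list each target among their k neighbors."""
    counts = Counter(j for row in neighbors for j in row)
    return [counts[j] for j in range(n_tgt)]


# Synthetic stand-ins for aligned source and target embeddings (assumption).
random.seed(0)
dim, n = 50, 200
src = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n)]
tgt = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(n)]

neighbors = knn_retrieve(src, tgt, k=5)
counts = k_occurrence(neighbors, n_tgt=n)

# Targets with a k-occurrence far above the average (here k = 5) behave as hubs.
print("max k-occurrence:", max(counts))
print("targets never retrieved:", counts.count(0))
```

A target whose count is well above the average of k appears in many candidate lists and so acts as a hub. Discarding such targets outright would also discard the one source query for which that target is the correct translation, which is the flaw in hub elimination that the abstract points out.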
Details
- ISSN :
- 1757-899X and 1757-8981
- Volume :
- 533
- Database :
- OpenAIRE
- Journal :
- IOP Conference Series: Materials Science and Engineering
- Accession number :
- edsair.doi...........36f89e7e2401d9d4d835fb516aa0bd10
- Full Text :
- https://doi.org/10.1088/1757-899x/533/1/012051