English–Vietnamese cross-language paraphrase identification using hybrid feature classes

Authors :
Nguyen Le Thanh
Dien Dinh
Source :
Journal of Heuristics. 28:193-209
Publication Year :
2019
Publisher :
Springer Science and Business Media LLC, 2019.

Abstract

Paraphrase identification plays an important role in many natural language processing applications, such as machine translation, bilingual information retrieval, and plagiarism detection. With the development of information technology and the Internet, text comparison is needed not only within a single language but also across many language pairs. In Vietnam in particular, there is high demand for detecting paraphrases in English–Vietnamese sentence pairs, because English is one of the most popular foreign languages in the country. However, in-depth studies of cross-language paraphrase identification between English and Vietnamese are still limited. Therefore, in this paper, we propose a method for identifying English–Vietnamese cross-language paraphrases using hybrid feature classes. These classes are calculated using a fuzzy-based method and a Siamese recurrent model, and are then combined with a mathematical formula to produce the final result. The experimental results show that our model achieves an F-measure of 87.4%.

Details

ISSN :
1572-9397 and 1381-1231
Volume :
28
Database :
OpenAIRE
Journal :
Journal of Heuristics
Accession number :
edsair.doi...........b786f53cccc81861b46e3ccd19342e5d