Key word extraction for short text via word2vec, doc2vec, and textrank.

Authors :
Jun LI
Guimin HUANG
Chunli FAN
Zhenglin SUN
Hongtao ZHU
Source :
Turkish Journal of Electrical Engineering & Computer Sciences. 2019, Vol. 27 Issue 3, p1794-1805. 12p.
Publication Year :
2019

Abstract

The rapid development of social media encourages people to share their opinions and feelings on the Internet. Every day, a large number of short text comments are generated through Twitter, microblogging, WeChat, etc., and there is high commercial and social value in extracting useful information from these short texts. At present, most studies focus on extracting key words from text. For example, the LDA topic model performs well on long texts, but it loses effectiveness on short texts because of noise and sparsity. In this paper, we use Word2Vec and Doc2Vec to improve short-text key word extraction. We first train word vectors and paragraph vectors collaboratively and then cluster the nodes of the TextRank model. We adjust the weights of the generated key words by computing the jump probability between nodes, obtain a node-weighted score, and finally rank the key words by that score. The experimental results show that the improved method performs well on the datasets. [ABSTRACT FROM AUTHOR]
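
The idea of weighting TextRank jumps by vector similarity can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact jump-probability formula or the clustering step, so the sketch assumes the jump probability from one word to a neighbor is proportional to the cosine similarity of their vectors, and it uses small hand-written toy vectors in place of trained Word2Vec/Doc2Vec embeddings. The names `cosine`, `weighted_textrank`, `window`, and `damping` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def weighted_textrank(tokens, vectors, window=2, damping=0.85, iters=50):
    """Rank words with a PageRank-style walk over a co-occurrence graph.

    Assumption (not from the paper): the probability of jumping from word n
    to neighbor w is proportional to the cosine similarity of their vectors.
    """
    words = sorted(set(tokens))

    # Undirected co-occurrence edges between words at most `window` apart.
    neighbors = {w: set() for w in words}
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            if tokens[j] != w:
                neighbors[w].add(tokens[j])
                neighbors[tokens[j]].add(w)

    # Edge weights: similarity clipped at zero, plus a small epsilon so that
    # every existing edge keeps a positive jump probability.
    weight = {
        w: {n: max(cosine(vectors[w], vectors[n]), 0.0) + 1e-6
            for n in neighbors[w]}
        for w in words
    }
    out_sum = {w: sum(weight[w].values()) for w in words}

    score = {w: 1.0 / len(words) for w in words}
    for _ in range(iters):
        new_score = {}
        for w in words:
            # Mass arriving at w: each neighbor n passes on its score times
            # the normalized jump probability n -> w.
            incoming = sum(score[n] * weight[n][w] / out_sum[n]
                           for n in neighbors[w])
            new_score[w] = (1 - damping) / len(words) + damping * incoming
        score = new_score

    return sorted(score.items(), key=lambda kv: kv[1], reverse=True)


# Toy short text and toy 2-d vectors (stand-ins for trained embeddings).
tokens = ["phone", "battery", "life", "great", "phone", "screen", "great"]
vectors = {
    "phone": [1.0, 0.2],
    "battery": [0.9, 0.3],
    "life": [0.4, 0.8],
    "great": [0.1, 1.0],
    "screen": [0.8, 0.1],
}

for word, value in weighted_textrank(tokens, vectors):
    print(f"{word}\t{value:.4f}")
```

In practice the toy `vectors` dictionary would be replaced by embeddings trained on the short-text corpus, and the paper's clustering and weight-adjustment steps would be applied on top of this basic walk.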

Details

Language :
English
ISSN :
1300-0632
Volume :
27
Issue :
3
Database :
Academic Search Index
Journal :
Turkish Journal of Electrical Engineering & Computer Sciences
Publication Type :
Academic Journal
Accession number :
136928441
Full Text :
https://doi.org/10.3906/elk-1806-38