1. Document-specific keyphrase candidate search and ranking.
- Author
- Wang, Qingren; Sheng, Victor S.; Wu, Xindong
- Subjects
- *KEYWORDS, *INDEXING, *SUBJECT headings, *NUMERICAL analysis, *MATHEMATICAL analysis
- Abstract
- This paper proposes an approach, KeyRank, to extract proper keyphrases from a document in English. It first searches the document for all keyphrase candidates, and then ranks them to select the top-N as final keyphrases. Existing studies show that extracting a complete keyphrase candidate set that includes semantic relations in context, and evaluating the effectiveness of each candidate, are crucial to extracting high-quality keyphrases from documents. Based on the observation that words do not appear repeatedly in an effective English keyphrase, a novel keyphrase candidate search algorithm using sequential pattern mining with gap constraints (called KCSP) is proposed to extract keyphrase candidates for KeyRank. An effectiveness evaluation measure, pattern frequency with entropy (called PF-H), is then proposed for KeyRank to rank these keyphrase candidates. Our experimental results show that KeyRank has better performance. Its first component, KCSP, is much more efficient than a closely related approach, SPMW, and its second component, PF-H, is an effective evaluation mechanism for ranking keyphrase candidates. Footnote 1: Our two-page extended abstract is published in AAAI 2017 (Wang, Sheng, & Wu, 2017). [ABSTRACT FROM AUTHOR]
- Published
- 2018
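
The two-stage pipeline described in the abstract (search for candidates, then rank them and keep the top N) can be sketched roughly as follows. This is an illustrative sketch only: the abstract does not define KCSP or PF-H, so the function names `candidates` and `rank`, the parameters `max_len` and `max_gap`, and the score (occurrence count weighted by the Shannon entropy of a candidate's spread across sentences) are all assumptions made here, not the authors' algorithm or measure. The only details taken from the abstract are the gap constraint, the rule that a word does not repeat within a candidate, and the top-N selection.

```python
import math
import re
from collections import Counter, defaultdict


def candidates(tokens, max_len=3, max_gap=1):
    """Yield word tuples mined from one sentence.

    Consecutive words of a tuple may skip at most `max_gap` tokens, and no
    word may repeat inside a tuple (the observation stated in the abstract).
    `max_len` and `max_gap` are hypothetical parameters for this sketch.
    """
    def extend(prefix, last):
        yield prefix
        if len(prefix) == max_len:
            return
        # Next word may sit 1 .. max_gap + 1 positions after the last one.
        for nxt in range(last + 1, min(last + 2 + max_gap, len(tokens))):
            if tokens[nxt] not in prefix:
                yield from extend(prefix + (tokens[nxt],), nxt)

    for start in range(len(tokens)):
        yield from extend((tokens[start],), start)


def rank(text, top_n=3, max_len=3, max_gap=1):
    """Return the `top_n` multi-word candidates by a stand-in score.

    The score is NOT PF-H; it is an assumed placeholder that multiplies a
    candidate's total count by (1 + entropy of its per-sentence counts).
    """
    sentences = [re.findall(r"[a-z]+", s.lower()) for s in re.split(r"[.!?]", text)]
    counts = defaultdict(Counter)  # candidate -> {sentence index: occurrences}
    for i, tokens in enumerate(sentences):
        for phrase in candidates(tokens, max_len, max_gap):
            if len(phrase) > 1:
                counts[phrase][i] += 1

    scores = {}
    for phrase, per_sentence in counts.items():
        total = sum(per_sentence.values())
        entropy = -sum((c / total) * math.log2(c / total) for c in per_sentence.values())
        scores[phrase] = total * (1.0 + entropy)
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda p: (-scores[p], p))[:top_n]
```

Calling `rank(document_text, top_n=5)` returns up to five word tuples. A realistic version would at least filter stop words and would replace the placeholder score with the measure defined in the paper itself.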