Back to Search Start Over

Mining User-Aware Rare Sequential Topic Patterns in Document Streams.

Authors :
Zhu, Jiaqi
Wang, Kaijun
Wu, Yunkun
Hu, Zhongyi
Wang, Hongan
Source :
IEEE Transactions on Knowledge & Data Engineering. 7/1/2016, Vol. 28 Issue 7, p1790-1804. 15p.
Publication Year :
2016

Abstract

Textual documents created and distributed on the Internet are ever changing in various forms. Most of existing works are devoted to topic modeling and the evolution of individual topics, while sequential relations of topics in successive documents published by a specific user are ignored. In this paper, in order to characterize and detect personalized and abnormal behaviors of Internet users, we propose Sequential Topic Patterns (STPs) and formulate the problem of mining User-aware Rare Sequential Topic Patterns (URSTPs) in document streams on the Internet. They are rare on the whole but relatively frequent for specific users, so can be applied in many real-life scenarios, such as real-time monitoring on abnormal user behaviors. We present a group of algorithms to solve this innovative mining problem through three phases: preprocessing to extract probabilistic topics and identify sessions for different users, generating all the STP candidates with (expected) support values for each user by pattern-growth, and selecting URSTPs by making user-aware rarity analysis on derived STPs. Experiments on both real (Twitter) and synthetic datasets show that our approach can indeed discover special users and interpretable URSTPs effectively and efficiently, which significantly reflect users’ characteristics. [ABSTRACT FROM PUBLISHER]

Details

Language :
English
ISSN :
10414347
Volume :
28
Issue :
7
Database :
Academic Search Index
Journal :
IEEE Transactions on Knowledge & Data Engineering
Publication Type :
Academic Journal
Accession number :
116115944
Full Text :
https://doi.org/10.1109/TKDE.2016.2541149