Identifying Users across Different Sites using Usernames.

Authors :
Wang, Yubin
Liu, Tingwen
Tan, Qingfeng
Shi, Jinqiao
Guo, Li
Source :
Procedia Computer Science; 2016, Vol. 80, p376-385, 10p
Publication Year :
2016

Abstract

Identifying users across different sites means finding the accounts that belong to the same individual. The problem is fundamental, and its solutions can benefit many applications such as social recommendation. Observing that 1) usernames are essential elements of all sites; 2) most users have a limited number of usernames on the Internet; and 3) usernames carry information that reflects an individual’s characteristics and habits, this paper tries to identify users based on username similarity. Specifically, we introduce the self-information vector model to integrate our proposed content and pattern features extracted from usernames into vectors. We define the similarity of two usernames as the cosine similarity between their self-information vectors. We further propose an abbreviation detection method to discover the initialism phenomenon in usernames, which can improve our user identification results. Experimental results on real-world username sets show that we can achieve 86.19% precision, 68.53% recall and 76.21% F1-measure on average, which is better than the state-of-the-art work. [ABSTRACT FROM AUTHOR]
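
The idea of comparing usernames by the cosine similarity of self-information vectors can be illustrated with a minimal Python sketch. The abstract does not specify the paper's content and pattern features, so this sketch makes several assumptions: character bigrams stand in for the features, a feature's probability is estimated as the fraction of usernames in a small hypothetical corpus that contain it, and unseen features receive a floor probability. All function names and the sample corpus are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def bigrams(username):
    """Character bigrams of a lowercased username (assumed stand-in features)."""
    name = username.lower()
    return [name[i:i + 2] for i in range(len(name) - 1)]


def feature_probabilities(corpus):
    """Fraction of usernames in the corpus that contain each bigram."""
    counts = Counter()
    for name in corpus:
        counts.update(set(bigrams(name)))
    return {feature: count / len(corpus) for feature, count in counts.items()}


def self_information_vector(username, probs, floor):
    """Weight each bigram by its self-information, -log2(p).

    Bigrams absent from the corpus use the assumed probability `floor`.
    """
    return {f: -math.log2(probs.get(f, floor)) for f in set(bigrams(username))}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v[f] for f, weight in u.items() if f in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def username_similarity(a, b, probs, floor):
    """Cosine similarity of the self-information vectors of two usernames."""
    return cosine(self_information_vector(a, probs, floor),
                  self_information_vector(b, probs, floor))


# Hypothetical corpus, used only to estimate feature probabilities.
corpus = ["alice92", "alice_w", "bob_k", "xx_bob", "carol"]
probs = feature_probabilities(corpus)
floor = 1 / (len(corpus) + 1)

close = username_similarity("alice92", "alice_w", probs, floor)
far = username_similarity("alice92", "bob_k", probs, floor)
print(close > far)
```

Weighting by self-information means that rare features contribute more to the similarity than common ones, so two usernames sharing an unusual substring score higher than two sharing a ubiquitous one. The abbreviation detection step mentioned in the abstract is not covered by this sketch.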

Details

Language :
English
ISSN :
18770509
Volume :
80
Database :
Supplemental Index
Journal :
Procedia Computer Science
Publication Type :
Academic Journal
Accession number :
115845015
Full Text :
https://doi.org/10.1016/j.procs.2016.05.336