1. Disambiguating usernames across platforms: the GeekMAN approach.
- Author
- Masud, Md Rayhanul; Treves, Ben; Faloutsos, Michalis
- Abstract
How can we identify malicious hackers participating in different online platforms using their usernames only? Establishing the identity of a user across online platforms (e.g., security forums, GitHub, YouTube) is an essential capability for tracing malicious hackers. Although a hacker could pick arbitrary names, they often use the same or similar usernames, as this helps them establish an online "brand". We propose GeekMAN, a systematic human-inspired approach to identify similar usernames across online platforms, focusing on technogeek platforms. The key novelty consists of the development and integration of three capabilities: (a) decomposing usernames into meaningful chunks, (b) de-obfuscating technical and slang conventions, and (c) considering all the different outcomes of the two previous functions exhaustively when calculating the similarity. We conduct a study using 1.8M usernames from three different types of platforms: (a) security forums, (b) malware authors from GitHub, and (c) mainstream social media platforms, which we use as reference. First, our method outperforms previous methods with a Precision of 81–86% on technogeek datasets. Second, we find 6327 forum users that match malware authors on GitHub with a high similarity score (≥ 0.7). Finally, we provide a translation dictionary for slang terms with 5.8K entries, and create the GeekMAN platform to facilitate further studies: https://geekman.streamlit.app.
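The three capabilities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tiny `LEET` map stands in for the 5.8K-entry dictionary, the chunking rule (separators and camelCase boundaries), the function names, and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions made for illustration, so scores from this sketch are not comparable to the paper's ≥ 0.7 threshold.

```python
import re
from difflib import SequenceMatcher
from itertools import product

# Hypothetical stand-in for the paper's slang/leetspeak dictionary.
LEET = {"0": "o", "1": "l", "3": "e", "4": "a", "5": "s", "7": "t", "@": "a", "$": "s"}

def split_chunks(name):
    """(a) Decompose: split on separators and lower-to-upper case boundaries."""
    parts = re.split(r"[_\-.\s]+|(?<=[a-z0-9])(?=[A-Z])", name)
    return [p.lower() for p in parts if p]

def chunk_variants(chunk):
    """(b) De-obfuscate: keep the raw chunk and its leet-translated form."""
    translated = "".join(LEET.get(ch, ch) for ch in chunk)
    return {chunk, translated}

def readings(name):
    """(c) Enumerate every combination of per-chunk variants."""
    options = [sorted(chunk_variants(c)) for c in split_chunks(name)]
    return {"".join(combo) for combo in product(*options)}

def similarity(a, b):
    """Best string-similarity ratio over all pairs of readings."""
    return max(
        SequenceMatcher(None, ra, rb).ratio()
        for ra in readings(a)
        for rb in readings(b)
    )

if __name__ == "__main__":
    print(similarity("Dark_H4ck3r", "darkHacker"))
    print(similarity("Dark_H4ck3r", "sunflower"))
```

Taking the maximum over all pairs of readings mirrors the idea of considering every decomposition and de-obfuscation outcome; a real system would need a far larger dictionary and a way to bound the number of combinations.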
- Published
- 2024