Heterogeneous Network Crawling: Reaching Target Nodes by Motif-Guided Navigation.

Authors :
Wang, Changyu
Chang, Kevin Chen-Chuan
Wang, Pinghui
Qin, Tao
Guan, Xiaohong
Source :
IEEE Transactions on Knowledge & Data Engineering. Sep 2022, Vol. 34, Issue 9, p4285-4297. 13p.
Publication Year :
2022

Abstract

Online heterogeneous networks contain vast numbers of nodes, so reaching and extracting the target nodes of specific interest is a pressing problem. In this paper, we propose a novel heterogeneous network crawler, MCrawl. It addresses the problem by iteratively crawling an online heterogeneous network through its available APIs, starting from a set of target nodes, i.e., seed nodes. Two challenges arise. First, how do we navigate a vast network when starting from only a small set of target nodes? In other words, which nodes in the “current frontier” should we expand, and in which direction, to reach promising target nodes quickly? We propose motif-based crawling to exploit the complex structures and rich semantics of heterogeneous networks. Second, in many scenarios, we have no classifier to assess the quality of the harvested nodes, and thus of the motifs to expand. We develop a probabilistic inference framework to estimate the yield and harvest rates of motifs, achieving principled bootstrapping for crawling. Experiments on real networks show that MCrawl outperforms baselines by significant margins. [ABSTRACT FROM AUTHOR]
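
The abstract does not give MCrawl's algorithm, so the following is only a minimal illustrative sketch of the general idea of ranking expansion directions by an estimated harvest rate. It is not the authors' method. Everything in it is an assumption made for illustration: the toy graph, the function names (api_neighbors, follow, estimate_harvest_rate, expand_once), the simplification of a motif to a sequence of edge types, and the use of a smoothed fraction of already-known targets as a stand-in for the paper's probabilistic inference framework.

```python
from typing import Dict, List, Optional, Set, Tuple

Motif = Tuple[str, ...]  # assumption: a motif is simplified to a path of edge types

# Hypothetical toy graph standing in for a network reachable only through an API.
GRAPH: Dict[str, List[Tuple[str, str]]] = {
    "a1": [("writes", "p1"), ("writes", "p2")],
    "a2": [("writes", "p2"), ("writes", "p3")],
    "a3": [("writes", "p3")],
    "a4": [("writes", "p4")],
    "a5": [("writes", "p4"), ("writes", "p5")],
    "p1": [("written_by", "a1"), ("published_in", "v1")],
    "p2": [("written_by", "a1"), ("written_by", "a2"), ("published_in", "v1")],
    "p3": [("written_by", "a2"), ("written_by", "a3"), ("published_in", "v1")],
    "p4": [("written_by", "a4"), ("written_by", "a5"), ("published_in", "v2")],
    "p5": [("written_by", "a5"), ("published_in", "v1")],
    "v1": [("publishes", "p1"), ("publishes", "p2"),
           ("publishes", "p3"), ("publishes", "p5")],
    "v2": [("publishes", "p4")],
}

COAUTHOR: Motif = ("writes", "written_by")
SAME_VENUE: Motif = ("writes", "published_in", "publishes", "written_by")


def api_neighbors(node: str, edge_type: str) -> Set[str]:
    """Simulate one API call: neighbors of `node` over edges of `edge_type`."""
    return {nbr for etype, nbr in GRAPH.get(node, []) if etype == edge_type}


def follow(start: str, motif: Motif) -> Set[str]:
    """Nodes reached from `start` by following the motif's edge types in order."""
    frontier = {start}
    for edge_type in motif:
        frontier = {n for node in frontier for n in api_neighbors(node, edge_type)}
    return frontier - {start}


def estimate_harvest_rate(hits: int, total: int,
                          alpha: float = 1.0, beta: float = 1.0) -> float:
    """Smoothed fraction of reached nodes that are known targets
    (posterior mean of a Beta(alpha, beta) prior; an assumed stand-in)."""
    return (hits + alpha) / (total + alpha + beta)


def expand_once(targets: Set[str],
                motifs: List[Motif]) -> Optional[Tuple[Motif, float, Set[str]]]:
    """Pick the motif with the highest estimated harvest rate that still
    yields unseen nodes; return (motif, rate, new_nodes), or None if none does."""
    best: Optional[Tuple[Motif, float, Set[str]]] = None
    for motif in motifs:
        reached: Set[str] = set()
        for node in targets:
            reached |= follow(node, motif)
        new_nodes = reached - targets
        if not new_nodes:
            continue
        rate = estimate_harvest_rate(len(reached & targets), len(reached))
        if best is None or rate > best[1]:
            best = (motif, rate, new_nodes)
    return best


if __name__ == "__main__":
    print(expand_once({"a1", "a2"}, [COAUTHOR, SAME_VENUE]))
```

In this toy setting, starting from seeds a1 and a2, the co-author path scores higher than the same-venue path because a larger share of the nodes it reaches are already known targets, so it is chosen for expansion. Repeating the step with the enlarged target set gives an iterative crawl; a practical crawler would also need an API budget or stopping rule, which this sketch omits.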

Details

Language :
English
ISSN :
10414347
Volume :
34
Issue :
9
Database :
Academic Search Index
Journal :
IEEE Transactions on Knowledge & Data Engineering
Publication Type :
Academic Journal
Accession number :
158405969
Full Text :
https://doi.org/10.1109/TKDE.2020.3038458