1. An effective representation learning model for link prediction in heterogeneous information networks.
- Author
-
Kumar, Vishnu and Krishna, P. Radha
- Subjects
- *
INFORMATION networks , *RANDOM walks , *PREDICTION models , *SEMANTICS - Abstract
Heterogeneous Information Networks (HINs) consist of multiple categories of nodes and edges and encompass rich semantic information. Representing HINs in a low-dimensional feature space is challenging due to its complex structure and rich semantics. In this paper, we focus on link prediction and node classification by learning efficient low-dimensional feature representations of HINs. Metapath-guided walkers have been extensively studied in the literature for learning feature representations. However, the metapath walker does not control the length of random walks, resulting in weak structural and semantic information embeddings. In this work, we present an influence propagation controlled metapath-guided random walk model (called IPCMetapath2Vec) for representation learning in HINs. The model works in three phases: first, we perform node transition to generate a metapath-guided random walk, which is conditioned on two factors: (i) type mapping of the next node according to the metapath, and (ii) compute influence propagation score for each node and detect potential influencers on the walk by a threshold based filter. Next, we provide the collected random walks as input to the skip-gram model to learn each node's feature representation. Lastly, we employ an attention mechanism that aggregates the learned feature representations of each node from various semantic metapath-guided walks, preserving the importance of different semantics. We use these network representation features to address link prediction and multi-label node classification tasks. Experimental results on two public HIN datasets, namely DBLP and IMDB, show that our model outperforms the state-of-the-art representation learning models such as DeepWalk, Node2vec, Metapath2Vec, and HIN2Vec by 4.5% to 17.2% in terms of micro-F1 score for multi-label node classification and 4% to 14.50% in terms of AUC-ROC score for link prediction. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF