Weighted Sampling based Large-scale Enclosing Subgraphs Embedding for Link Prediction
  • Ganglin Hu
Chongqing College of Electronic Engineering

Corresponding Author: [email protected]

Abstract

Link prediction is a fundamental problem on graphs; it can reveal potential relationships between users. Graph embedding encodes both structural relations and heterogeneous attribute features in a continuous vector space, which makes it effective for link prediction. However, graph embedding methods incur high computation and space costs on large-scale graphs, and sampling enclosing subgraphs is a practical and efficient way to capture the most features at the least cost. Nevertheless, existing sampling techniques assume that node features follow a uniform distribution, so they may lose essential features when the number of randomly sampled nodes is small. In this paper, we propose a novel enclosing-subgraph embedding model, Weighted Sampling Enclosing-subgraph Embedding (WSEE), to resolve this issue; it preserves as much of the structural and attribute features of enclosing subgraphs as possible with less sampling. More specifically, we first extract the feature importance of each node in an enclosing subgraph and take it as that node's weight. Node sequences are then obtained by running multiple weighted random walks from a target pair of nodes, and these sequences form a weighted-sampling enclosing subgraph. By leveraging this subgraph, WSEE can scale to larger graphs with much less overhead while retaining essential information of the original graph. Experiments on real-world datasets demonstrate that our model scales to larger graphs with acceptable overhead while link prediction performance is unaffected.
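The sampling step described in the abstract can be illustrated with a minimal sketch. This is not the WSEE implementation: the abstract does not specify how node importance is computed, so the sketch assumes node degree as a stand-in weight, and the function names (`weighted_walk`, `sample_enclosing_subgraph`), the parameters `num_walks` and `length`, and the toy graph are all hypothetical. It only shows the general idea of running several weighted random walks from a target pair and keeping the subgraph induced by the visited nodes.

```python
import random

def weighted_walk(adj, weight, start, length, rng):
    """One weighted random walk: each step picks a neighbor of the current
    node with probability proportional to that neighbor's weight."""
    walk = [start]
    for _ in range(length):
        nbrs = adj[walk[-1]]
        if not nbrs:
            break  # dead end: stop this walk early
        nxt = rng.choices(nbrs, weights=[weight[n] for n in nbrs], k=1)[0]
        walk.append(nxt)
    return walk

def sample_enclosing_subgraph(adj, weight, u, v, num_walks=4, length=3, seed=0):
    """Run several weighted walks from each node of the target pair (u, v)
    and return the node set plus the edges induced on it."""
    rng = random.Random(seed)
    nodes = {u, v}
    for start in (u, v):
        for _ in range(num_walks):
            nodes.update(weighted_walk(adj, weight, start, length, rng))
    # Keep each undirected edge once, as (smaller id, larger id).
    edges = {(a, b) for a in nodes for b in adj[a] if b in nodes and a < b}
    return nodes, edges

# Toy undirected graph as adjacency lists (hypothetical data).
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2, 4], 4: [3]}
# ASSUMPTION: degree stands in for the paper's feature-importance weight.
weight = {n: len(nbrs) for n, nbrs in adj.items()}

nodes, edges = sample_enclosing_subgraph(adj, weight, u=0, v=4)
print(sorted(nodes), sorted(edges))
```

Under this assumption, high-weight neighbors are visited more often, so a small number of short walks is more likely to cover the important nodes around the target pair than uniform sampling would be. Any other importance score could be substituted for `weight` without changing the rest of the sketch.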