GENTI: GPU-powered Walk-based Subgraph Extraction for Scalable Representation Learning on Dynamic Graphs

Abstract

Graph representation learning (GRL) is an emerging task for effectively embedding graph-structured data with learned features. Among GRL approaches, subgraph-based GRL (SGRL) methods have demonstrated better scalability and expressiveness on large-scale GRL tasks. The core challenge of applying SGRL to dynamic graphs lies in adapting subgraph extraction to evolving data while keeping computation efficient. To address this efficiency bottleneck, we propose GENTI, a GPU-oriented SGRL algorithm for dynamic graphs. Our approach mainly improves the critical subgraph extraction stage by decoupling it into two phases, neighbor sampling and subgraph gathering, which are performed asynchronously on the CPU and the GPU, respectively. This design removes the dependence of feature learning on subgraph extraction and exploits GPU batch processing to substantially accelerate computation throughout the pipeline. Dedicated data structures are designed to manage dynamic graph storage efficiently and to support efficient subgraph operations. Extensive empirical results on various real-world dynamic graphs show that GENTI achieves up to 30 times faster subgraph extraction than state-of-the-art walk-based methods and up to 26 times faster overall learning, while maintaining comparable prediction performance. In particular, it completes learning on the largest available graph, with 1.3 billion edges, within 24 hours, whereas all other baselines incur prohibitive overhead.
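The two-phase split described in the abstract can be illustrated with a small producer-consumer sketch. This is not GENTI's implementation: the toy edge list, the function names (`sample_neighbors`, `gather_subgraph`, `run_pipeline`), the rule that only edges older than the query time are eligible, and the use of a plain Python thread in place of a GPU kernel are all assumptions made for illustration. It shows only the general idea of one stage sampling neighbors while another stage gathers subgraphs from a shared buffer.

```python
import queue
import random
import threading

# Toy temporal edge list: (source, destination, timestamp). Hypothetical data.
EDGES = [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 3.0), (2, 3, 4.0), (3, 0, 5.0)]


def build_adjacency(edges):
    """Map each node to its (neighbor, timestamp) pairs; edges are treated as undirected."""
    adj = {}
    for u, v, t in edges:
        adj.setdefault(u, []).append((v, t))
        adj.setdefault(v, []).append((u, t))
    return adj


def sample_neighbors(adj, node, before, k, rng):
    """Phase 1 (CPU side): pick up to k neighbors whose edge is older than `before`.

    The temporal filter is an assumption of this sketch, not a detail from the paper.
    """
    candidates = [(n, t) for n, t in adj.get(node, []) if t < before]
    if len(candidates) <= k:
        return candidates
    return rng.sample(candidates, k)


def gather_subgraph(root, sampled):
    """Phase 2 (stand-in for the GPU side): collect sampled nodes into one subgraph."""
    return {"root": root, "nodes": sorted({root} | {n for n, _ in sampled})}


def run_pipeline(queries, k=2, seed=0):
    """Run sampling in a background thread while the main thread gathers subgraphs."""
    adj = build_adjacency(EDGES)
    rng = random.Random(seed)
    buffer = queue.Queue(maxsize=4)  # bounded buffer between the two phases
    results = []

    def producer():
        for node, ts in queries:
            buffer.put((node, sample_neighbors(adj, node, ts, k, rng)))
        buffer.put(None)  # sentinel: no more work

    worker = threading.Thread(target=producer)
    worker.start()
    while True:
        item = buffer.get()
        if item is None:
            break
        results.append(gather_subgraph(*item))
    worker.join()
    return results


if __name__ == "__main__":
    for subgraph in run_pipeline([(0, 6.0), (2, 2.5)], k=5):
        print(subgraph)
```

Because the two phases communicate only through the bounded queue, the gathering stage never waits on a specific sampling call, which is the kind of decoupling the abstract attributes to the asynchronous CPU/GPU design.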

Publication
In Proceedings of the VLDB Endowment 2024, Vol 17
Yu Zihao
Ph.D. Student