PurPL

Haoteng Yin
February 13, 2025

Learning on graphs has driven advances in scientific discovery, business modeling, and AI-assisted decision-making. However, two fundamental challenges hinder its broader adoption: (1) the scalability of expressive learning methods, particularly subgraph-based approaches, and (2) the need for privacy-preserving techniques to protect sensitive relational data. Both challenges stem from the intricate dependencies in graph structures, making traditional algorithmic and system-level optimizations insufficient. In this talk, I will present a unified framework that addresses both scalability and privacy challenges in graph learning through algorithm and system co-design. For scalability, I introduce a family of efficient and expressive graph representation learning frameworks—leveraging walk-, set-, and hash-based representations—that eliminate the redundancy in canonical subgraph-based methods, enabling large-scale learning on billion-edge graphs. This same co-design principle extends to privacy-preserving graph learning, where structural dependencies in relational data violate the assumptions of standard privacy mechanisms such as differentially private stochastic gradient descent (DP-SGD). To address this, I propose a novel training framework that disentangles these dependencies, enabling the application of fine-tuning large language models on graph data in sensitive domains while maintaining robust privacy guarantees. By jointly designing learning algorithms and system implementations, these solutions demonstrate how graph-based AI can be both efficient and trustworthy, unlocking new opportunities for learning from complex structured data in real-world applications.

About Haoteng Yin

Haoteng Yin is a Ph.D. candidate in Computer Science at Purdue University, advised by Prof. Pan Li. His research focuses on developing efficient, scalable, and trustworthy learning algorithms for graphs and foundational models for complex structured data. He has been recognized with the Best Paper Award at SMP ’16, the Best Poster Award at the 3rd RCAC Cybersecurity Symposium, the Top Reviewer Award at NeurIPS ’22, and the Purdue Teaching Academy Graduate Teaching Award in 2024.

Algorithm-System Co-Design for Scalable and Privacy-Preserving Graph Learning

About Haoteng Yin