Publications

(2026). Efficient and Scalable Synchronization via Generalized Cache Coherence. OSDI'26.

(2025). Efficient MoE Serving in the Memory-Bound Regime: Balance Activated Experts, Not Tokens. In Submission.

(2025). CORD: Low-Latency, Bandwidth-Efficient and Scalable Release Consistency via Directory Ordering. ISCA'25. Distinguished Artifact Award. Selected for IEEE Micro's Top Picks in Computer Architecture in 2025.

(2023). An Efficient Data Structure for Dynamic Graph on GPUs. TKDE'23.

(2021). MIND: In-Network Memory Management for Disaggregated Data Centers. SOSP'21.
