Efficient MoE Serving in the Memory-Bound Regime: Balance Activated Experts, Not Tokens
GCS: Generalized cache coherence for efficient and scalable synchronization
An Efficient Data Structure for Dynamic Graph on GPUs
MIND: In-Network Memory Management for Disaggregated Data Centers