Publications
Yanpeng Yu*, Haiyue Ma* (*equal contribution), Krish Agarwal, Nicolai Oswald, Qijing Huang, Hugo Linsenmaier, Chunhui Mei, Ritchie Zhao, Ritika Borkar, Bita Rouhani, David Nellans, Ronny Krashinsky, Anurag Khandelwal
(2025).
Efficient MoE Serving in the Memory-Bound Regime: Balance Activated Experts, Not Tokens.
In Submission.