Shuzhang Zhong, Yanfan Sun, Ling Liang, Runsheng Wang, Ru Huang, Meng Li. HybriMoE: Hybrid CPU-GPU Scheduling and Cache Management for Efficient MoE Inference. In 62nd ACM/IEEE Design Automation Conference, DAC 2025, San Francisco, CA, USA, June 22-25, 2025. Pages 1-7, IEEE, 2025.