H2-LLM: Hardware-Dataflow Co-Exploration for Heterogeneous Hybrid-Bonding-based Low-Batch LLM Inference

Cong Li, Yihan Yin, Xintong Wu, Jingchen Zhu, Zhutianya Gao, Dimin Niu, Qiang Wu, Xin Si, Yuan Xie 0001, Chen Zhang, Guangyu Sun 0003. H2-LLM: Hardware-Dataflow Co-Exploration for Heterogeneous Hybrid-Bonding-based Low-Batch LLM Inference. In Proceedings of the 52nd Annual International Symposium on Computer Architecture, ISCA 2025, Tokyo, Japan, June 21-25, 2025. pages 194-210, ACM, 2025.

Authors

Cong Li

Yihan Yin

Xintong Wu

Jingchen Zhu

Zhutianya Gao

Dimin Niu

Qiang Wu

Xin Si

Yuan Xie 0001

Chen Zhang

Guangyu Sun 0003
