Fast Large Language Model Collaborative Decoding via Speculation

Jiale Fu, Yuchu Jiang, JunKai Chen, Jiaming Fan, Xin Geng, Xu Yang. Fast Large Language Model Collaborative Decoding via Speculation. In Forty-second International Conference on Machine Learning, ICML 2025, Vancouver, BC, Canada, July 13-19, 2025. OpenReview.net, 2025.

Abstract

Abstract is missing.