Fast Large Language Model Collaborative Decoding via Speculation

Jiale Fu, Yuchu Jiang, JunKai Chen, Jiaming Fan, Xin Geng 0001, Xu Yang 0021. Fast Large Language Model Collaborative Decoding via Speculation. In Forty-second International Conference on Machine Learning, ICML 2025, Vancouver, BC, Canada, July 13-19, 2025. OpenReview.net, 2025.

Authors

Jiale Fu

Yuchu Jiang

JunKai Chen

Jiaming Fan

Xin Geng 0001

Xu Yang 0021
