Break the Sequential Dependency of LLM Inference Using Lookahead Decoding

Yichao Fu, Peter Bailis, Ion Stoica, Hao Zhang 0108. Break the Sequential Dependency of LLM Inference Using Lookahead Decoding. In Forty-first International Conference on Machine Learning, ICML 2024, Vienna, Austria, July 21-27, 2024. OpenReview.net, 2024.

Authors

Yichao Fu

Peter Bailis

Ion Stoica

Hao Zhang 0108
