Accelerating LLM Inference with Lossless Speculative Decoding Algorithms for Heterogeneous Vocabularies

Nadav Timor, Jonathan Mamou, Daniel Korat, Moshe Berchansky, Gaurav Jain, Oren Pereg, Moshe Wasserblat, David Harel. Accelerating LLM Inference with Lossless Speculative Decoding Algorithms for Heterogeneous Vocabularies. In Forty-second International Conference on Machine Learning, ICML 2025, Vancouver, BC, Canada, July 13-19, 2025. OpenReview.net, 2025.

Authors

Nadav Timor

Jonathan Mamou

Daniel Korat

Moshe Berchansky

Gaurav Jain

Oren Pereg

Moshe Wasserblat

David Harel
