Draft on the Fly: Adaptive Self-Speculative Decoding using Cosine Similarity

Michael R. Metel, Peng Lu, Boxing Chen, Mehdi Rezagholizadeh, Ivan Kobyzev. Draft on the Fly: Adaptive Self-Speculative Decoding using Cosine Similarity. In Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen, editors, Findings of the Association for Computational Linguistics: EMNLP 2024, Miami, Florida, USA, November 12-16, 2024. Pages 2267-2272, Association for Computational Linguistics, 2024.
