Break the Sequential Dependency of LLM Inference Using Lookahead Decoding

Yichao Fu, Peter Bailis, Ion Stoica, Hao Zhang. Break the Sequential Dependency of LLM Inference Using Lookahead Decoding. In Forty-first International Conference on Machine Learning, ICML 2024, Vienna, Austria, July 21-27, 2024. OpenReview.net, 2024.

@inproceedings{FuBS024,
  title = {Break the Sequential Dependency of LLM Inference Using Lookahead Decoding},
  author = {Yichao Fu and Peter Bailis and Ion Stoica and Hao Zhang},
  year = {2024},
  url = {https://openreview.net/forum?id=eDjvSFOkXw},
  researchr = {https://researchr.org/publication/FuBS024},
  booktitle = {Forty-first International Conference on Machine Learning, ICML 2024, Vienna, Austria, July 21-27, 2024},
  publisher = {OpenReview.net},
}