Myeonggu Kang, Junyoung Park, Hyein Shin, Jaekang Shin, Lee-Sup Kim. ToEx: Accelerating Generation Stage of Transformer-Based Language Models via Token-Adaptive Early Exit. IEEE Transactions on Computers, 73(9):2248-2261, September 2024.