Handling heavy-tailed input of transformer inference on GPUs

Jiangsu Du, Jiazhi Jiang, Yang You, Dan Huang, Yutong Lu. Handling heavy-tailed input of transformer inference on GPUs. In Lawrence Rauchwerger, Kirk W. Cameron, Dimitrios S. Nikolopoulos, Dionisios N. Pnevmatikatos, editors, ICS '22: 2022 International Conference on Supercomputing, Virtual Event, June 28 - 30, 2022. ACM, 2022.
