Flover: A Temporal Fusion Framework for Efficient Autoregressive Model Parallel Inference

Jinghan Yao, Nawras Alnaasan, Tian Chen, Aamir Shafi, Hari Subramoni, Dhabaleswar K. (DK) Panda. Flover: A Temporal Fusion Framework for Efficient Autoregressive Model Parallel Inference. In 30th IEEE International Conference on High Performance Computing, Data, and Analytics, HiPC 2023, Goa, India, December 18-21, 2023. Pages 107-116, IEEE, 2023.
