Communication-aware Quantization for Deep Learning Inference Parallelization on Chiplet-based Accelerators

Kaiwei Zou, Songyun Qu, Wen Li, Ying Wang, Huawei Li, Yongpan Liu. Communication-aware Quantization for Deep Learning Inference Parallelization on Chiplet-based Accelerators. In 29th IEEE International Conference on Parallel and Distributed Systems, ICPADS 2023, Ocean Flower Island, China, December 17-21, 2023, pages 1123-1130. IEEE, 2023.
