Runtime Deep Model Multiplexing for Reduced Latency and Energy Consumption Inference

Amir Erfan Eshratifar, Massoud Pedram. Runtime Deep Model Multiplexing for Reduced Latency and Energy Consumption Inference. In 38th IEEE International Conference on Computer Design, ICCD 2020, Hartford, CT, USA, October 18-21, 2020. Pages 263-270, IEEE, 2020.
