Wenjing Ke, Zhe Li, Dong Li, Lu Tian, Emad Barsoum. DL-QAT: Weight-Decomposed Low-Rank Quantization-Aware Training for Large Language Models. In Franck Dernoncourt, Daniel Preotiuc-Pietro, Anastasia Shimorina, editors, Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: EMNLP 2024 - Industry Track, Miami, Florida, USA, November 12-16, 2024. pages 113-119, Association for Computational Linguistics, 2024.