LLM-FP4: 4-Bit Floating-Point Quantized Transformers

Shih-Yang Liu, Zechun Liu, Xijie Huang, Pingcheng Dong, Kwang-Ting Cheng. LLM-FP4: 4-Bit Floating-Point Quantized Transformers. In Houda Bouamor, Juan Pino, Kalika Bali, editors, Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023, Singapore, December 6-10, 2023, pages 592-605. Association for Computational Linguistics, 2023.

@inproceedings{LiuLHDC23,
  title = {LLM-FP4: 4-Bit Floating-Point Quantized Transformers},
  author = {Shih-Yang Liu and Zechun Liu and Xijie Huang and Pingcheng Dong and Kwang-Ting Cheng},
  year = {2023},
  url = {https://aclanthology.org/2023.emnlp-main.39},
  researchr = {https://researchr.org/publication/LiuLHDC23},
  pages = {592--605},
  booktitle = {Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023, Singapore, December 6-10, 2023},
  editor = {Houda Bouamor and Juan Pino and Kalika Bali},
  publisher = {Association for Computational Linguistics},
  isbn = {979-8-89176-060-8},
}