Sub 4-bit Power-of-Two-Based Mixed-Precision Quantization for Efficient LLM Compression and Acceleration

Han Cho, Apurba Prasad Padhy, Fernando Camacho, Saibal Mukhopadhyay. Sub 4-bit Power-of-Two-Based Mixed-Precision Quantization for Efficient LLM Compression and Acceleration. IEEE Access, 13:209356-209367, 2025.
