Sub 4-bit Power-of-Two-Based Mixed-Precision Quantization for Efficient LLM Compression and Acceleration

Han Cho, Apurba Prasad Padhy, Fernando Camacho, Saibal Mukhopadhyay. Sub 4-bit Power-of-Two-Based Mixed-Precision Quantization for Efficient LLM Compression and Acceleration. IEEE Access, 13:209356-209367, 2025. doi:10.1109/ACCESS.2025.3625771

@article{ChoPCM25,
  title = {Sub 4-bit Power-of-Two-Based Mixed-Precision Quantization for Efficient LLM Compression and Acceleration},
  author = {Han Cho and Apurba Prasad Padhy and Fernando Camacho and Saibal Mukhopadhyay},
  year = {2025},
  doi = {10.1109/ACCESS.2025.3625771},
  url = {https://doi.org/10.1109/ACCESS.2025.3625771},
  researchr = {https://researchr.org/publication/ChoPCM25},
  journal = {IEEE Access},
  volume = {13},
  pages = {209356--209367},
}