Train Flat, Then Compress: Sharpness-Aware Minimization Learns More Compressible Models

Clara Na, Sanket Vaibhav Mehta, Emma Strubell. Train Flat, Then Compress: Sharpness-Aware Minimization Learns More Compressible Models. In Yoav Goldberg, Zornitsa Kozareva, Yue Zhang, editors, Findings of the Association for Computational Linguistics: EMNLP 2022, Abu Dhabi, United Arab Emirates, December 7-11, 2022, pages 4909-4936. Association for Computational Linguistics, 2022.
