Hanlin Zhang, Depen Morwani, Nikhil Vyas, Jingfeng Wu, Difan Zou, Udaya Ghai, Dean P. Foster, Sham M. Kakade. How Does Critical Batch Size Scale in Pre-training? In The Thirteenth International Conference on Learning Representations, ICLR 2025, Singapore, April 24-28, 2025. OpenReview.net, 2025.