DataMan: Data Manager for Pre-training Large Language Models

Ru Peng, Kexin Yang, Yawen Zeng, Junyang Lin, Dayiheng Liu, Junbo Zhao. DataMan: Data Manager for Pre-training Large Language Models. In The Thirteenth International Conference on Learning Representations, ICLR 2025, Singapore, April 24-28, 2025. OpenReview.net, 2025.
