Improving Pretraining Data Using Perplexity Correlations

Tristan Thrush, Christopher Potts, Tatsunori Hashimoto. Improving Pretraining Data Using Perplexity Correlations. In The Thirteenth International Conference on Learning Representations, ICLR 2025, Singapore, April 24-28, 2025. OpenReview.net, 2025. [doi]

Abstract

Abstract is missing.