On the impact of pretraining data ordering in transformer encoder- and decoder-only language models

Luca Dini, Lucia Domenichelli, Dominique Brunato, Felice dell'Orletta. On the impact of pretraining data ordering in transformer encoder- and decoder-only language models. Knowl.-Based Syst., 342:115850, 2026.

Abstract

Abstract is missing.