Headless Language Models: Learning without Predicting with Contrastive Weight Tying

Nathan Godey, Éric Villemonte de la Clergerie, Benoît Sagot. Headless Language Models: Learning without Predicting with Contrastive Weight Tying. In The Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024. OpenReview.net, 2024.
