nGPT: Normalized Transformer with Representation Learning on the Hypersphere

Ilya Loshchilov, Cheng-Ping Hsieh, Simeng Sun, Boris Ginsburg. nGPT: Normalized Transformer with Representation Learning on the Hypersphere. In The Thirteenth International Conference on Learning Representations, ICLR 2025, Singapore, April 24-28, 2025. OpenReview.net, 2025.
