Efficient GPT Model Pre-training using Tensor Train Matrix Representation

Viktoria Chekalina, Georgiy Novikov, Julia Gusak, Alexander Panchenko, Ivan V. Oseledets. Efficient GPT Model Pre-training using Tensor Train Matrix Representation. In Chu-Ren Huang, Yasunari Harada, Jong-Bok Kim, Si Chen, Yu-Yin Hsu, Emmanuele Chersoni, Pranav A, Winnie Huiheng Zeng, Bo Peng, Yuxi Li, Junlin Li, editors, Proceedings of the 37th Pacific Asia Conference on Language, Information and Computation, PACLIC 2023, The Hong Kong Polytechnic University, Hong Kong, SAR, China, 2-4 December 2023, pages 600-608. Association for Computational Linguistics, 2023.
