An Adaptive Learning Method for Solving the Extreme Learning Rate Problem of Transformer

Jianbang Ding, Xuancheng Ren, Ruixuan Luo. An Adaptive Learning Method for Solving the Extreme Learning Rate Problem of Transformer. In Fei Liu, Nan Duan, Qingting Xu, Yu Hong, editors, Natural Language Processing and Chinese Computing - 12th National CCF Conference, NLPCC 2023, Foshan, China, October 12-15, 2023, Proceedings, Part I. Volume 14302 of Lecture Notes in Computer Science, pages 361-372, Springer, 2023.