No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models

Chen Liang, Haoming Jiang, Simiao Zuo, Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, Tuo Zhao. No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models. In The Tenth International Conference on Learning Representations, ICLR 2022, Virtual Event, April 25-29, 2022. OpenReview.net, 2022.

