Chen Liang, Haoming Jiang, Simiao Zuo, Pengcheng He, Xiaodong Liu, Jianfeng Gao, Weizhu Chen, Tuo Zhao. No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models. In The Tenth International Conference on Learning Representations, ICLR 2022, Virtual Event, April 25-29, 2022. OpenReview.net, 2022.