Acceleration of Large Transformer Model Training by Sensitivity-Based Layer Dropping

Yujie Zeng, Wenlong He, Ihor Vasyltsov, Jiali Pang, Lin Chen. Acceleration of Large Transformer Model Training by Sensitivity-Based Layer Dropping. In Brian Williams, Yiling Chen, Jennifer Neville, editors, Thirty-Seventh AAAI Conference on Artificial Intelligence, AAAI 2023, Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence, IAAI 2023, Thirteenth Symposium on Educational Advances in Artificial Intelligence, EAAI 2023, Washington, DC, USA, February 7-14, 2023, pages 11156-11163. AAAI Press, 2023.

Authors

Yujie Zeng

Wenlong He

Ihor Vasyltsov

Jiali Pang

Lin Chen
