Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes

Cheng-Yu Hsieh, Chun-Liang Li, Chih-Kuan Yeh, Hootan Nakhost, Yasuhisa Fujii, Alex Ratner, Ranjay Krishna, Chen-Yu Lee, Tomas Pfister. Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes. In Anna Rogers, Jordan L. Boyd-Graber, Naoaki Okazaki, editors, Findings of the Association for Computational Linguistics: ACL 2023, Toronto, Canada, July 9-14, 2023. pages 8003-8017, Association for Computational Linguistics, 2023.

Authors

Cheng-Yu Hsieh
Chun-Liang Li
Chih-Kuan Yeh
Hootan Nakhost
Yasuhisa Fujii
Alex Ratner
Ranjay Krishna
Chen-Yu Lee
Tomas Pfister