Explaining Language Model Predictions with High-Impact Concepts

Ruochen Zhao, Tan Wang, YongJie Wang, Shafiq Joty. Explaining Language Model Predictions with High-Impact Concepts. In Yvette Graham and Matthew Purver, editors, Findings of the Association for Computational Linguistics: EACL 2024, St. Julian's, Malta, March 17-22, 2024, pages 995-1012. Association for Computational Linguistics, 2024.
