Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations

Yanda Chen, Ruiqi Zhong, Narutatsu Ri, Chen Zhao, He He, Jacob Steinhardt, Zhou Yu, Kathleen R. McKeown. Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations. In Forty-first International Conference on Machine Learning, ICML 2024, Vienna, Austria, July 21-27, 2024. OpenReview.net, 2024.
