Atticus Geiger, Duligur Ibeling, Amir Zur, Maheep Chaudhary, Sonakshi Chauhan, Jing Huang, Aryaman Arora, Zhengxuan Wu, Noah D. Goodman, Christopher Potts, Thomas Icard. Causal Abstraction: A Theoretical Foundation for Mechanistic Interpretability. Journal of Machine Learning Research, 26, 2025.