Unveiling Decision-Making in LLMs for Text Classification : Extraction of influential and interpretable concepts with Sparse Autoencoders

Mathis Le Bail, Jérémie Dentan, Davide Buscaldi, Sonia Vanier. Unveiling Decision-Making in LLMs for Text Classification : Extraction of influential and interpretable concepts with Sparse Autoencoders. In Vera Demberg, Kentaro Inui, Lluís Marquez, editors, Findings of the Association for Computational Linguistics: EACL 2026, Rabat, Morocco, March 24-29, 2026, pages 2477-2504. Association for Computational Linguistics, 2026.

@inproceedings{BailDBV26,
  title = {Unveiling Decision-Making in LLMs for Text Classification : Extraction of influential and interpretable concepts with Sparse Autoencoders},
  author = {Mathis Le Bail and Jérémie Dentan and Davide Buscaldi and Sonia Vanier},
  year = {2026},
  url = {https://aclanthology.org/2026.findings-eacl.129/},
  researchr = {https://researchr.org/publication/BailDBV26},
  cites = {0},
  citedby = {0},
  pages = {2477--2504},
  booktitle = {Findings of the Association for Computational Linguistics: EACL 2026, Rabat, Morocco, March 24-29, 2026},
  editor = {Vera Demberg and Kentaro Inui and Lluís Marquez},
  publisher = {Association for Computational Linguistics},
  isbn = {979-8-89176-386-9},
}