An Investigation of Language Model Interpretability via Sentence Editing

Samuel Stevens, Yu Su. An Investigation of Language Model Interpretability via Sentence Editing. In Jasmijn Bastings, Yonatan Belinkov, Emmanuel Dupoux, Mario Giulianelli, Dieuwke Hupkes, Yuval Pinter, Hassan Sajjad, editors, Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, BlackboxNLP@EMNLP 2021, Punta Cana, Dominican Republic, November 11, 2021, pages 435-446. Association for Computational Linguistics, 2021.

@inproceedings{StevensS21,
  title = {An Investigation of Language Model Interpretability via Sentence Editing},
  author = {Samuel Stevens and Yu Su},
  year = {2021},
  url = {https://aclanthology.org/2021.blackboxnlp-1.34},
  researchr = {https://researchr.org/publication/StevensS21},
  cites = {0},
  citedby = {0},
  pages = {435--446},
  booktitle = {Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, BlackboxNLP@EMNLP 2021, Punta Cana, Dominican Republic, November 11, 2021},
  editor = {Jasmijn Bastings and Yonatan Belinkov and Emmanuel Dupoux and Mario Giulianelli and Dieuwke Hupkes and Yuval Pinter and Hassan Sajjad},
  publisher = {Association for Computational Linguistics},
  isbn = {978-1-955917-06-3},
}