Safeguarding Language Models via Self-Destruct Trapdoor

Shahar Katz, Bar Alon, Ariel Shaulov, Lior Wolf, Mahmood Sharif. Safeguarding Language Models via Self-Destruct Trapdoor. In Vera Demberg, Kentaro Inui, Lluís Marquez, editors, Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2026 - Volume 1: Long Papers, Rabat, Morocco, March 24-29, 2026, pages 6939-6958. Association for Computational Linguistics, 2026.

@inproceedings{KatzASWS26,
  title = {Safeguarding Language Models via Self-Destruct Trapdoor},
  author = {Shahar Katz and Bar Alon and Ariel Shaulov and Lior Wolf and Mahmood Sharif},
  year = {2026},
  url = {https://aclanthology.org/2026.eacl-long.326/},
  researchr = {https://researchr.org/publication/KatzASWS26},
  pages = {6939--6958},
  booktitle = {Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2026 - Volume 1: Long Papers, Rabat, Morocco, March 24-29, 2026},
  editor = {Vera Demberg and Kentaro Inui and Lluís Marquez},
  publisher = {Association for Computational Linguistics},
  isbn = {979-8-89176-380-7},
}