"That Is a Suspicious Reaction!": Interpreting Logits Variation to Detect NLP Adversarial Attacks - researchr publication authors


Edoardo Mosca, Shreyash Agarwal, Javier Rando-Ramirez, Georg Groh. "That Is a Suspicious Reaction!": Interpreting Logits Variation to Detect NLP Adversarial Attacks. In Smaranda Muresan, Preslav Nakov, Aline Villavicencio, editors, Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2022, Dublin, Ireland, May 22-27, 2022, pages 7806-7816. Association for Computational Linguistics, 2022.

