Is Safer Better? The Impact of Guardrails on the Argumentative Strength of LLMs in Hate Speech Countering

Helena Bonaldi, Greta Damo, Nicolas Ocampo, Elena Cabrio, Serena Villata, Marco Guerini. Is Safer Better? The Impact of Guardrails on the Argumentative Strength of LLMs in Hate Speech Countering. In Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen, editors, Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, EMNLP 2024, Miami, FL, USA, November 12-16, 2024. pages 3446-3463, Association for Computational Linguistics, 2024. [doi]

Abstract

Abstract is missing.