SweEval: Do LLMs Really Swear? A Safety Benchmark for Testing Limits for Enterprise Use

Hitesh Laxmichand Patel, Amit Agarwal, Arion Das, Bhargava Kumar, Srikant Panda, Priyaranjan Pattnayak, Taki Hasan Rafi, Tejaswini Kumar, Dong-Kyu Chae. SweEval: Do LLMs Really Swear? A Safety Benchmark for Testing Limits for Enterprise Use. In Weizhu Chen, Yi Yang, Mohammad Kachuee, Xue-Yong Fu, editors, Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2025 - Volume 3: Industry Track, Albuquerque, New Mexico, USA, April 30, 2025. pages 558-582, Association for Computational Linguistics, 2025.
