InstructSafety: A Unified Framework for Building Multidimensional and Explainable Safety Detector through Instruction Tuning

Zhexin Zhang, Jiale Cheng, Hao Sun, Jiawen Deng, Minlie Huang. InstructSafety: A Unified Framework for Building Multidimensional and Explainable Safety Detector through Instruction Tuning. In Houda Bouamor, Juan Pino, Kalika Bali, editors, Findings of the Association for Computational Linguistics: EMNLP 2023, Singapore, December 6-10, 2023, pages 10421-10436. Association for Computational Linguistics, 2023.
