Gradient Co-occurrence Analysis for Detecting Unsafe Prompts in Large Language Models

Jingyuan Yang, Bowen Yan, Rongjun Li, Ziyu Zhou, Xin Chen, Zhiyong Feng, Wei Peng. Gradient Co-occurrence Analysis for Detecting Unsafe Prompts in Large Language Models. In Xian-Ling Mao, Zhaochun Ren, Muyun Yang, editors, Natural Language Processing and Chinese Computing - 14th National CCF Conference, NLPCC 2025, Urumqi, China, August 7-9, 2025, Proceedings, Part II. Volume 16103 of Lecture Notes in Computer Science, pages 181-192, Springer, 2025.

Abstract

Abstract is missing.